LittleNyima
2c33c0982b
fix import order and deprecate for CVX 2B models
2025-02-26 15:54:58 +08:00
LittleNyima
d6bb910697
Merge branch 'THUDM:main' into feature/ddim-inversion
2025-02-26 15:22:08 +08:00
LittleNyima
e0bf395458
make the style of argparser consistent with repo
2025-02-23 19:41:21 +08:00
Yuxuan Zhang
e44c9f2c83
Merge pull request #716 from THUDM/CogVideoX_dev
Update gitignore patterns and project dependencies
2025-02-22 17:09:50 +08:00
OleehyO
5be6c0512f
Merge remote-tracking branch 'upstream/CogVideoX_dev' into dev
2025-02-22 06:06:03 +00:00
OleehyO
4dac252c63
[chore] Update gitignore patterns and project dependencies
2025-02-22 06:03:53 +00:00
LittleNyima
250a0bce45
stable version
2025-02-20 05:03:15 +00:00
LittleNyima
58d66c8a08
Implement an unverified version that should be further tested
2025-02-20 01:39:12 +08:00
LittleNyima
dd76b2b9ea
Initialize DDIM Inversion script
2025-02-18 09:50:55 +00:00
Yuxuan Zhang
34c6ba22ab
Merge pull request #682 from THUDM/main
Synchronize two branches.
2025-01-22 09:49:34 +08:00
Yuxuan Zhang
bbe909d7f7
Merge pull request #678 from THUDM/CogVideoX_dev
docs: clarify frame number requirements for CogVideoX models
2025-01-22 09:47:24 +08:00
Yuxuan Zhang
ea994c75c2
Merge pull request #652 from erfanasgari21/moviepy-v2
Update code and requirements to support Moviepy v2
2025-01-21 22:29:15 +08:00
Yuxuan Zhang
aa12ed37f5
Merge branch 'main' into moviepy-v2
2025-01-20 21:46:07 +08:00
OleehyO
d9e2a415e8
fix: fix resolution handling for different model types
2025-01-20 09:48:17 +00:00
OleehyO
0e26f54cbe
docs: clarify frame number requirements for CogVideoX models
Specify that frame numbers must be:
- 16N + 1 (N <= 10) for CogVideoX1.5-5B models
- 8N + 1 (N <= 6) for CogVideoX-2B/5B models
2025-01-20 09:43:45 +00:00
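The frame-number rule stated in the commit above (16N + 1 with N <= 10 for CogVideoX1.5-5B, 8N + 1 with N <= 6 for CogVideoX-2B/5B) can be checked with a few lines of Python. This is only a sketch: `FRAME_RULES` and `is_valid_frame_count` are hypothetical names that do not come from the repository, and the lower bound N >= 1 is an assumption, since the commit message gives only the upper bound.

```python
# Hypothetical helper; names are illustrative and not taken from the repository.
# Each entry maps a model name to (step, max_n) for the rule step * N + 1.
FRAME_RULES = {
    "cogvideox1.5-5b": (16, 10),  # 16N + 1, N <= 10
    "cogvideox-2b": (8, 6),       # 8N + 1, N <= 6
    "cogvideox-5b": (8, 6),       # 8N + 1, N <= 6
}


def is_valid_frame_count(num_frames: int, model: str) -> bool:
    """Return True if num_frames has the form step * N + 1 within the allowed N."""
    step, max_n = FRAME_RULES[model.lower()]
    n, remainder = divmod(num_frames - 1, step)
    # Assumption: N starts at 1; the commit message only states the upper bound.
    return remainder == 0 and 1 <= n <= max_n
```

For example, `is_valid_frame_count(81, "CogVideoX1.5-5B")` accepts 81 frames (16 * 5 + 1), while `is_valid_frame_count(57, "CogVideoX-5B")` rejects 57 frames because it would need N = 7.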
Yuxuan Zhang
c1ca70ba67
Merge pull request #654 from THUDM/CogVideoX_dev
Support SFT using ZeRO
2025-01-20 11:15:50 +08:00
OleehyO
bf73742c05
docs: enhance CLI demo documentation
2025-01-16 09:34:52 +00:00
OleehyO
bf9c351a10
deps: upgrade diffusers to >=0.32.1
2025-01-16 09:08:44 +00:00
OleehyO
0e78f20629
Merge remote-tracking branch 'upstream/CogVideoX_dev' into dev
2025-01-14 04:00:11 +00:00
Yuxuan Zhang
4615479b51
move to tools
2025-01-14 11:33:02 +08:00
Yuxuan Zhang
7993670957
zero_to_bf16
2025-01-14 11:31:25 +08:00
OleehyO
4878edd0cf
fix: correct do_validation argument parsing
2025-01-13 12:48:21 +00:00
Yuxuan Zhang
78275b0480
add comment of bash scripts
2025-01-13 20:02:06 +08:00
OleehyO
455b44a7b5
chore: code cleanup and parameter optimization
- Remove redundant comments and debug information
- Adjust default parameters in training scripts
- Clean up code in lora_trainer and trainer implementations
2025-01-13 11:56:28 +00:00
OleehyO
954ba28d3c
Merge remote-tracking branch 'upstream/CogVideoX_dev' into dev
2025-01-13 11:48:24 +00:00
OleehyO
4f1cc66815
fix: correct LoRA loading and resolution dimensions
- Fix LoRA loading by specifying 'transformer' component
- Swap width/height order in RESOLUTION_MAP to match actual usage
2025-01-13 10:49:46 +00:00
zR
1534bf33eb
add pipeline
2025-01-12 19:27:21 +08:00
OleehyO
86a0226f80
Merge remote-tracking branch 'upstream/CogVideoX_dev' into dev
2025-01-12 08:52:07 +00:00
OleehyO
70c899f444
chore: update default training configurations
2025-01-12 08:50:15 +00:00
OleehyO
b362663679
fix: normalize image tensors in I2VDataset
2025-01-12 06:01:48 +00:00
OleehyO
30ba1085ff
Merge remote-tracking branch 'upstream/main' into dev
2025-01-12 05:58:07 +00:00
OleehyO
3252614569
Add pydantic dependency
2025-01-12 05:56:24 +00:00
OleehyO
f66f1647e2
Merge pull request #657 from ZGCTroy/main
fix bug of i2v finetune
2025-01-12 13:55:12 +08:00
OleehyO
f5169385bd
docs: add SFT support documentation in multilingual README
2025-01-12 05:53:13 +00:00
OleehyO
795dd144a4
Rename lora training scripts as ddp
2025-01-12 05:36:32 +00:00
OleehyO
fdb9820949
feat: support DeepSpeed ZeRO-3 and optimize peak memory usage
- Add DeepSpeed ZeRO-3 configuration support
- Optimize memory usage during training
- Rename training scripts to reflect ZeRO usage
- Update related configuration files and trainers
2025-01-12 05:33:56 +00:00
Zheng Guang Cong
09a49d3546
fix bug of i2v; video is already 0-255
The video is already in the 0-255 range and should not be multiplied by 255 again.
2025-01-11 17:29:27 +08:00
Zheng Guang Cong
cd861bbe1e
Update i2v_dataset.py
image should also be transformed to [-1, 1]
2025-01-11 17:24:35 +08:00
Zheng Guang Cong
35383e2db3
fix potential bug of i2v
Image value is in [0, 255] and should be transformed into [-1, 1], similar to video.
2025-01-11 17:08:25 +08:00
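The value-range mapping described in the commit above, from [0, 255] to [-1, 1], can be illustrated with plain Python. This is a minimal sketch of the arithmetic only: `to_signed_unit_range` is a hypothetical name, the real dataset code operates on tensors, and the exact expression used in `i2v_dataset.py` is not shown in this log.

```python
# Hypothetical helper illustrating the [0, 255] -> [-1, 1] mapping on a flat
# list of pixel values; not the repository's implementation.
def to_signed_unit_range(pixels):
    """Map pixel values in [0, 255] linearly onto [-1, 1]."""
    # Dividing by 127.5 maps [0, 255] to [0, 2]; subtracting 1 centres it on 0.
    return [p / 127.5 - 1.0 for p in pixels]
```

Under this mapping 0 becomes -1.0 and 255 becomes 1.0. Applying it to input that has already been scaled, or multiplying by 255 a second time, pushes values outside the expected range.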
zR
7dc8516bcb
add comment as #653
2025-01-11 12:53:32 +08:00
OleehyO
2f275e82b5
Merge remote-tracking branch 'upstream/CogVideoX_dev' into dev
2025-01-11 02:16:09 +00:00
OleehyO
caa24bdc36
feat: add SFT support with ZeRO optimization strategies
- Add SFT (Supervised Fine-Tuning) trainers for all model variants:
  - CogVideoX I2V and T2V
  - CogVideoX-1.5 I2V and T2V
- Add DeepSpeed ZeRO configuration files:
  - ZeRO-2 with and without CPU offload
  - ZeRO-3 with and without CPU offload
- Add base accelerate config for distributed training
- Update trainer.py to support SFT training mode
This enables full-parameter fine-tuning with memory-efficient distributed training using DeepSpeed ZeRO optimization.
2025-01-11 02:13:32 +00:00
OleehyO
e213b6c083
fix: pad latent frames to match patch_size_t requirements
2025-01-11 02:08:07 +00:00
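The commit above pads latent frames so that their count satisfies patch_size_t. Assuming this means the count must be a multiple of patch_size_t, the number of extra frames can be computed as below. `latent_padding` is a hypothetical name. The commit message does not say where the padding is inserted or what values it holds, so this sketch covers only the count.

```python
# Hypothetical helper; computes only how many frames would need to be added.
# Assumption: the requirement is divisibility of the frame count by patch_size_t.
def latent_padding(num_latent_frames: int, patch_size_t: int) -> int:
    """Return the number of frames needed to reach a multiple of patch_size_t."""
    if patch_size_t <= 0:
        raise ValueError("patch_size_t must be positive")
    # Python's % returns a non-negative result for a positive modulus,
    # so this is 0 when the count is already a multiple.
    return (-num_latent_frames) % patch_size_t
```

With `patch_size_t = 2`, 13 latent frames would need one extra frame and 12 would need none.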
Erfan Asgari
70ca65300c
upgrade to moviepy v2
2025-01-11 00:18:24 +03:30
OleehyO
f6d722cec7
fix: remove copying first video frame as conditioning image
2025-01-09 15:52:51 +00:00
OleehyO
07766001f6
feat(dataset): pad short videos by repeating last frame
When loading videos with fewer frames than max_num_frames, repeat the last
frame to reach the required length instead of failing. This ensures consistent
tensor dimensions across the dataset while preserving as much original video
content as possible.
2025-01-08 02:14:56 +00:00
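The padding behaviour described in the commit above, repeating the last frame until the clip reaches max_num_frames, can be sketched on a plain Python list. `pad_frames` is a hypothetical name and the repository works on video tensors, not lists. Returning longer clips unchanged is also an assumption, since the commit message only covers the short case.

```python
# Hypothetical helper; frames can be any list of per-frame objects.
def pad_frames(frames, max_num_frames):
    """Repeat the last frame until the clip has max_num_frames frames."""
    if not frames:
        raise ValueError("cannot pad an empty clip: there is no last frame to repeat")
    missing = max_num_frames - len(frames)
    # Assumption: clips that are already long enough are returned unchanged;
    # [x] * n is an empty list when n <= 0.
    return list(frames) + [frames[-1]] * missing
```

For instance, `pad_frames(["f0", "f1", "f2"], 5)` keeps the three original frames and appends two copies of `"f2"`, so every clip in a batch ends up with the same length.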
Yuxuan Zhang
8f1829f1cd
Merge pull request #642 from THUDM/CogVideoX_dev
New Lora 20250108
2025-01-08 09:51:39 +08:00
zR
045e1b308b
readme
2025-01-08 09:50:08 +08:00
OleehyO
249fadfb76
docs: add hardware requirements for model training
Add a table in README files showing hardware requirements for training
different CogVideoX models, including:
- Memory requirements for each model variant
- Supported training types (LoRA)
- Training resolutions
- Mixed precision settings
Updated in all language versions (EN/ZH/JA).
2025-01-08 01:39:37 +00:00
OleehyO
10de04fc08
perf: cast VAE and text encoder to target dtype before precomputing cache
Before precomputing the latent cache and text embeddings, cast the VAE and
text encoder to the target training dtype (fp16/bf16) instead of keeping them
in fp32. This reduces memory usage during the precomputation phase.
The change occurs in prepare_dataset() where the models are moved to device
and cast to weight_dtype before being used to generate the cache.
2025-01-08 01:38:13 +00:00