431 Commits

Author SHA1 Message Date
Yuxuan Zhang
8d90381ba8
Merge pull request #722 from THUDM/main
merge
2025-02-27 17:32:24 +08:00
Yuxuan Zhang
eb66c9c6dc
Merge pull request #709 from LittleNyima/feature/ddim-inversion
Implement DDIM Inversion for CogVideoX
2025-02-27 13:24:24 +08:00
LittleNyima
2c33c0982b fix import order and deprecate for CVX 2B models 2025-02-26 15:54:58 +08:00
LittleNyima
d6bb910697
Merge branch 'THUDM:main' into feature/ddim-inversion 2025-02-26 15:22:08 +08:00
LittleNyima
e0bf395458
make the style of argparser consistent with repo 2025-02-23 19:41:21 +08:00
Yuxuan Zhang
e44c9f2c83
Merge pull request #716 from THUDM/CogVideoX_dev
Update gitignore patterns and project dependencies
2025-02-22 17:09:50 +08:00
OleehyO
5be6c0512f Merge remote-tracking branch 'upstream/CogVideoX_dev' into dev 2025-02-22 06:06:03 +00:00
OleehyO
4dac252c63 [chore] Update gitignore patterns and project dependencies 2025-02-22 06:03:53 +00:00
LittleNyima
250a0bce45 stable version 2025-02-20 05:03:15 +00:00
LittleNyima
58d66c8a08
Implement an unverified version that should be further tested 2025-02-20 01:39:12 +08:00
LittleNyima
dd76b2b9ea Initialize DDIM Inversion script 2025-02-18 09:50:55 +00:00
Yuxuan Zhang
34c6ba22ab
Merge pull request #682 from THUDM/main
Synchronize two branches.
2025-01-22 09:49:34 +08:00
Yuxuan Zhang
bbe909d7f7
Merge pull request #678 from THUDM/CogVideoX_dev
docs: clarify frame number requirements for CogVideoX models
2025-01-22 09:47:24 +08:00
Yuxuan Zhang
ea994c75c2
Merge pull request #652 from erfanasgari21/moviepy-v2
Update code and requirements to support Moviepy v2
2025-01-21 22:29:15 +08:00
Yuxuan Zhang
aa12ed37f5
Merge branch 'main' into moviepy-v2 2025-01-20 21:46:07 +08:00
OleehyO
d9e2a415e8 fix: fix resolution handling for different model types 2025-01-20 09:48:17 +00:00
OleehyO
0e26f54cbe docs: clarify frame number requirements for CogVideoX models
Specify that frame numbers must be:
- 16N + 1 (N <= 10) for CogVideoX1.5-5B models
- 8N + 1 (N <= 6) for CogVideoX-2B/5B models
2025-01-20 09:43:45 +00:00
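The frame-number rule in the commit above can be checked with a few lines of Python. This is a minimal sketch, not code from the repository: the function name `is_valid_frame_count`, the model-name keys, and the lower bound N >= 1 are assumptions made for illustration (the commit message only states the upper bounds).

```python
# Hypothetical helper illustrating the frame-count rule from the commit message.
# (step, max_n) pairs: frames must equal step * N + 1 with N <= max_n.
FRAME_RULES = {
    "cogvideox1.5-5b": (16, 10),  # 16N + 1, N <= 10
    "cogvideox-2b": (8, 6),       # 8N + 1, N <= 6
    "cogvideox-5b": (8, 6),       # 8N + 1, N <= 6
}


def is_valid_frame_count(num_frames: int, model: str) -> bool:
    """Return True if num_frames satisfies the rule for the given model.

    Assumption: N starts at 1; the commit message gives only the upper bound.
    """
    step, max_n = FRAME_RULES[model.lower()]
    n, remainder = divmod(num_frames - 1, step)
    return remainder == 0 and 1 <= n <= max_n


if __name__ == "__main__":
    for frames, model in [(81, "CogVideoX1.5-5B"), (49, "CogVideoX-2B"), (57, "CogVideoX-5B")]:
        print(frames, model, is_valid_frame_count(frames, model))
```

Under these assumptions, 81 frames (16 * 5 + 1) passes for CogVideoX1.5-5B and 49 frames (8 * 6 + 1) passes for CogVideoX-2B, while 57 frames (8 * 7 + 1) fails for CogVideoX-5B because N exceeds 6.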
Yuxuan Zhang
c1ca70ba67
Merge pull request #654 from THUDM/CogVideoX_dev
Support SFT using ZeRO
2025-01-20 11:15:50 +08:00
OleehyO
bf73742c05 docs: enhance CLI demo documentation 2025-01-16 09:34:52 +00:00
OleehyO
bf9c351a10 deps: upgrade diffusers to >=0.32.1 2025-01-16 09:08:44 +00:00
OleehyO
0e78f20629 Merge remote-tracking branch 'upstream/CogVideoX_dev' into dev 2025-01-14 04:00:11 +00:00
Yuxuan Zhang
4615479b51 move to tools 2025-01-14 11:33:02 +08:00
Yuxuan Zhang
7993670957 zero_to_bf16 2025-01-14 11:31:25 +08:00
OleehyO
4878edd0cf fix: correct do_validation argument parsing 2025-01-13 12:48:21 +00:00
Yuxuan Zhang
78275b0480 add comment of bash scripts 2025-01-13 20:02:06 +08:00
OleehyO
455b44a7b5 chore: code cleanup and parameter optimization
- Remove redundant comments and debug information
- Adjust default parameters in training scripts
- Clean up code in lora_trainer and trainer implementations
2025-01-13 11:56:28 +00:00
OleehyO
954ba28d3c Merge remote-tracking branch 'upstream/CogVideoX_dev' into dev 2025-01-13 11:48:24 +00:00
OleehyO
4f1cc66815 fix: correct LoRA loading and resolution dimensions
- Fix LoRA loading by specifying 'transformer' component
- Swap width/height order in RESOLUTION_MAP to match actual usage
2025-01-13 10:49:46 +00:00
zR
1534bf33eb add pipeline 2025-01-12 19:27:21 +08:00
OleehyO
86a0226f80 Merge remote-tracking branch 'upstream/CogVideoX_dev' into dev 2025-01-12 08:52:07 +00:00
OleehyO
70c899f444 chore: update default training configurations 2025-01-12 08:50:15 +00:00
OleehyO
b362663679 fix: normalize image tensors in I2VDataset 2025-01-12 06:01:48 +00:00
OleehyO
30ba1085ff Merge remote-tracking branch 'upstream/main' into dev 2025-01-12 05:58:07 +00:00
OleehyO
3252614569 Add pydantic dependency 2025-01-12 05:56:24 +00:00
OleehyO
f66f1647e2
Merge pull request #657 from ZGCTroy/main
fix bug of i2v finetune
2025-01-12 13:55:12 +08:00
OleehyO
f5169385bd docs: add SFT support documentation in multilingual README 2025-01-12 05:53:13 +00:00
OleehyO
795dd144a4 Rename lora training scripts as ddp 2025-01-12 05:36:32 +00:00
OleehyO
fdb9820949 feat: support DeepSpeed ZeRO-3 and optimize peak memory usage
- Add DeepSpeed ZeRO-3 configuration support
- Optimize memory usage during training
- Rename training scripts to reflect ZeRO usage
- Update related configuration files and trainers
2025-01-12 05:33:56 +00:00
Zheng Guang Cong
09a49d3546
fix bug of i2v; video is already 0-255
The video is already in the 0-255 range and should not be multiplied by 255 again.
2025-01-11 17:29:27 +08:00
Zheng Guang Cong
cd861bbe1e
Update i2v_dataset.py
image should also be transformed to [-1, 1]
2025-01-11 17:24:35 +08:00
Zheng Guang Cong
35383e2db3
fix potential bug of i2v
Image value is in [0, 255] and should be transformed into [-1, 1], similar to video.
2025-01-11 17:08:25 +08:00
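The fix described above maps pixel values from [0, 255] to [-1, 1]. The sketch below shows one common way to write that mapping in plain Python; the function names are hypothetical, and the actual transform in `i2v_dataset.py` may be written differently (for example on tensors rather than lists).

```python
# Hypothetical illustration of rescaling pixel values from [0, 255] to [-1, 1].
def to_signed_unit_range(pixel: float) -> float:
    """Map a pixel value in [0, 255] linearly onto [-1, 1]."""
    if not 0 <= pixel <= 255:
        raise ValueError(f"pixel value {pixel} is outside [0, 255]")
    return pixel / 127.5 - 1.0


def normalize_image(pixels: list[float]) -> list[float]:
    """Apply the mapping to every value of a flattened image."""
    return [to_signed_unit_range(p) for p in pixels]


if __name__ == "__main__":
    print(normalize_image([0, 127.5, 255]))
```

Applying the same mapping to both the conditioning image and the video frames keeps the two inputs on the same scale, which is the point of the commit.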
zR
7dc8516bcb add comment as #653 2025-01-11 12:53:32 +08:00
OleehyO
2f275e82b5 Merge remote-tracking branch 'upstream/CogVideoX_dev' into dev 2025-01-11 02:16:09 +00:00
OleehyO
caa24bdc36 feat: add SFT support with ZeRO optimization strategies
- Add SFT (Supervised Fine-Tuning) trainers for all model variants:
  - CogVideoX I2V and T2V
  - CogVideoX-1.5 I2V and T2V
- Add DeepSpeed ZeRO configuration files:
  - ZeRO-2 with and without CPU offload
  - ZeRO-3 with and without CPU offload
- Add base accelerate config for distributed training
- Update trainer.py to support SFT training mode

This enables full-parameter fine-tuning with memory-efficient distributed training using DeepSpeed ZeRO optimization.
2025-01-11 02:13:32 +00:00
OleehyO
e213b6c083 fix: pad latent frames to match patch_size_t requirements 2025-01-11 02:08:07 +00:00
Erfan Asgari
70ca65300c
upgrade to moviepy v2 2025-01-11 00:18:24 +03:30
OleehyO
f6d722cec7 fix: remove copying first video frame as conditioning image 2025-01-09 15:52:51 +00:00
OleehyO
07766001f6 feat(dataset): pad short videos by repeating last frame
When loading videos with fewer frames than max_num_frames, repeat the last
frame to reach the required length instead of failing. This ensures consistent
tensor dimensions across the dataset while preserving as much original video
content as possible.
2025-01-08 02:14:56 +00:00
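The padding behaviour described in this commit can be sketched as follows. This is an illustrative version on plain Python lists, not the repository's implementation: the name `pad_frames` is hypothetical, and leaving longer videos unchanged is an assumption, since the commit message only covers the short-video case.

```python
# Hypothetical sketch of padding a short video by repeating its last frame.
def pad_frames(frames: list, max_num_frames: int) -> list:
    """Return a copy of frames padded to max_num_frames with the last frame.

    Assumption: videos that already have max_num_frames or more frames are
    returned unchanged; truncation is out of scope for this sketch.
    """
    if not frames:
        raise ValueError("cannot pad an empty video")
    missing = max_num_frames - len(frames)
    if missing <= 0:
        return list(frames)
    return list(frames) + [frames[-1]] * missing


if __name__ == "__main__":
    print(pad_frames(["f0", "f1"], 4))
```

Repeating the last frame keeps all original content at the start of the clip and gives every sample the same length, so batches can be stacked without special handling.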
Yuxuan Zhang
8f1829f1cd
Merge pull request #642 from THUDM/CogVideoX_dev
New Lora 20250108
2025-01-08 09:51:39 +08:00
zR
045e1b308b readme 2025-01-08 09:50:08 +08:00