CogVideo

mirror of https://github.com/THUDM/CogVideo.git synced 2026-01-08 22:19:06 +08:00

Author	SHA1	Message	Date
OleehyO	04a60e7435	Change logger name to trainer	2025-01-01 15:10:55 +00:00
OleehyO	a001842834	feat: implement CogVideoX trainers for I2V and T2V tasks Add and refactor trainers for CogVideoX model variants: - Implement CogVideoXT2VLoraTrainer for text-to-video generation - Refactor CogVideoXI2VLoraTrainer for image-to-video generation Both trainers support LoRA fine-tuning with proper handling of: - Model components loading and initialization - Video encoding and batch collation - Loss computation with noise prediction - Validation step for generation	2025-01-01 15:10:54 +00:00
OleehyO	91d79fd9a4	feat: add schemas module for configuration and state management Add Pydantic models to handle: - CLI arguments and configuration (Args) - Model components and pipeline (Components) - Training state and parameters (State)	2025-01-01 15:10:54 +00:00
OleehyO	45d40450a1	refactor: simplify dataset implementation and add latent precomputation - Replace bucket-based dataset with simpler resize-based implementation - Add video latent precomputation during dataset initialization - Improve code readability and user experience - Remove complexity of bucket sampling for better maintainability This change makes the codebase more straightforward and easier to use while maintaining functionality through resize-based video processing.	2025-01-01 15:10:54 +00:00
OleehyO	6eae5c201e	feat: add latent caching for video encodings - Add caching mechanism to store VAE-encoded video latents to disk - Cache latents in a "latent" subdirectory alongside video files - Skip re-encoding when cached latent file exists - Add logging for successful cache saves - Minor code cleanup and formatting improvements This change improves training efficiency by avoiding redundant video encoding operations.	2025-01-01 15:10:42 +00:00
OleehyO	2a6cca0656	Add type conversion and validation checks	2025-01-01 15:10:42 +00:00
OleehyO	fa4659fb2c	feat(trainer): add validation functionality to Trainer class Add validation capabilities to the Trainer class including: - Support for validating images and videos during training - Periodic validation based on validation_steps parameter - Artifact logging to wandb for validation results - Memory tracking during validation process	2025-01-01 15:10:41 +00:00
OleehyO	6971364591	Export file_utils.py	2025-01-01 15:10:41 +00:00
OleehyO	60f6a3d7ee	feat: add base trainer implementation and training script - Add Trainer base class with core training loop functionality - Implement distributed training setup with Accelerate - Add training script with model/trainer initialization - Support LoRA fine-tuning with checkpointing and validation	2025-01-01 15:10:41 +00:00
OleehyO	a505f2e312	Add constants.py	2025-01-01 15:10:40 +00:00
OleehyO	78f655a9a4	Add utils	2025-01-01 15:10:40 +00:00
OleehyO	85e00a1082	feat(models): add scaffolding	2025-01-01 15:10:40 +00:00
OleehyO	918ebb5a54	feat(datasets): implement video dataset modules - Add dataset implementations for text-to-video and image-to-video - Include bucket sampler for efficient batch processing - Add utility functions for data processing - Create dataset package structure with proper initialization	2025-01-01 15:10:40 +00:00
OleehyO	e3f6def234	feat: add video frame extraction tool Add utility script to extract first frames from videos, helping users convert T2V datasets to I2V format	2025-01-01 15:10:39 +00:00
OleehyO	7b282246dd	chore: remove unused configuration files after refactoring Delete accelerate configs, deepspeed config and host file that are no longer needed	2025-01-01 15:10:39 +00:00
OleehyO	5cb9303286	chore: update .gitignore - Add new ignore patterns for dataset and model directories - Update rules for development files	2025-01-01 15:10:32 +00:00
OleehyO	ba85627577	[docs] improve help messages in argument parser Fix and clarify help documentation in parser.add_argument() to better describe command-line arguments.	2025-01-01 15:10:31 +00:00
OleehyO	2508c8353b	[bugfix] fix specific resolution setting Different models use different resolutions: for the CogVideoX1.5 series models, the optimal generation resolution is 1360x768, but for CogVideoX, the best resolution is 720x480.	2025-01-01 15:10:31 +00:00
Gforky	48ac9c1066	[fix]fix typo in train_cogvideox_image_to_video_lora.py	2025-01-01 15:10:30 +00:00
Zheng Guang Cong	21693ca770	fix bugs of image-to-video without image-condition	2025-01-01 15:10:30 +00:00
OleehyO	d3a7d2dc91	Add resolution warning	2024-12-16 11:34:51 +00:00
OleehyO	7b4c9db6d9	Fix for CogVideoX-{2B,5B} When loading CogVideoX-{2B,5B}, `patch_size_t` is None, which breaks the `prepare_rotary_position_embeddings` function.	2024-12-13 04:02:27 +00:00
OleehyO	36f1333788	Fix for deepspeed training	2024-12-13 04:02:26 +00:00
OleehyO	4d1b9fd166	Fix for Disney video dataset	2024-12-13 04:02:21 +00:00
OleehyO	3ff9d3049d	docs: change "read this in English" to "中文阅读" Update README.md to use Chinese text for language switch link	2024-12-11 05:10:28 +00:00
Yuxuan.Zhang	87ccd38cea	Merge pull request #567 from THUDM/main New Finetune	2024-12-02 11:30:20 +08:00
Yuxuan.Zhang	5aa6d3a9ee	Merge pull request #515 from Gforky/fix_finetune_demo [fix]fix deepspeed initialization issue in finetune examples	2024-12-02 11:29:42 +08:00
Yuxuan.Zhang	a094b34425	Merge pull request #565 from THUDM/CogVideoX_dev Cog video x dev	2024-11-30 12:45:25 +08:00
zR	0fe46df21f	new jobs of friendly link	2024-11-30 12:40:07 +08:00
Yuxuan.Zhang	f1a2b48974	Merge pull request #556 from THUDM/main new announced	2024-11-27 12:11:12 +08:00
Yuxuan.Zhang	d82922cc79	Merge pull request #538 from spacegoing/fix_rope_finetune_shape [Fix] fix rope temporal patch size	2024-11-23 21:24:39 +08:00
spacegoing	2fb763d25f	[Fix] fix rope temporal patch size	2024-11-21 16:26:45 +00:00
luwen.miao	ac2f2c78f7	[fix]fix deepspeed initialization issue in finetune examples	2024-11-18 09:49:31 +00:00
Yuxuan.Zhang	2fdc59c3ce	Merge pull request #507 from THUDM/CogVideoX_dev diffusers version	2024-11-17 21:54:47 +08:00
zR	17996f11f8	update	2024-11-16 10:06:22 +08:00
Yuxuan.Zhang	5e3e3aabe0	Merge pull request #500 from THUDM/main Merge	2024-11-13 21:15:49 +08:00
zR	e7a35ea33b	update friendly link	2024-11-13 17:06:16 +08:00
zR	cd5ceca22b	fix resolution docs	2024-11-12 00:41:23 +08:00
zR	bb2cb130a0	add width and height	2024-11-12 00:17:19 +08:00
zR	2151a3bdfb	update with diffusers	2024-11-11 22:41:28 +08:00
zR	68d93ce8fc	fix	2024-11-09 22:51:39 +08:00
zR	155456befa	update	2024-11-09 22:49:03 +08:00
zR	2475902027	friendly link	2024-11-09 22:43:02 +08:00
zR	fb806eecce	update table	2024-11-09 22:29:36 +08:00
zR	c8c7b62aa1	update diffusers code	2024-11-09 22:07:32 +08:00
Yuxuan.Zhang	e2987ff565	Merge pull request #474 from THUDM/CogVideoX_dev Fix #472 #473	2024-11-09 00:18:01 +08:00
zR	a8205b575d	Update cp_enc_dec.py	2024-11-08 23:27:44 +08:00
zR	e7bcecf947	remove wrong fake_cp	2024-11-08 22:54:17 +08:00
zR	d8ee013842	add 10 second comment	2024-11-08 22:31:39 +08:00
zR	e43a7645fd	Update autoencoder.py	2024-11-08 21:49:02 +08:00

338 Commits