CogVideo

mirror of https://github.com/THUDM/CogVideo.git synced 2025-11-20 01:44:21 +08:00

Author	SHA1	Message	Date
OleehyO	f731c35f70	Add unload_model function	2025-01-03 08:21:27 +00:00
OleehyO	a88c1ede69	feat(args): add validation for training resolution - Add validation check to ensure number of frames is multiple of 8 - Add format validation for train_resolution string (frames x height x width)	2025-01-02 03:12:09 +00:00
OleehyO	362b7bf273	docs: update README in multiple languages	2025-01-02 03:07:34 +00:00
OleehyO	7fa1bb48be	refactor: remove deprecated training scripts	2025-01-01 15:56:14 +00:00
OleehyO	48ad178818	Reorganize training script arguments	2025-01-01 15:52:39 +00:00
OleehyO	6e79472417	feat: add training launch scripts for I2V and T2V models Add two shell scripts to simplify model training: - accelerate_train_i2v.sh: Launch script for Image-to-Video training - accelerate_train_t2v.sh: Launch script for Text-to-Video training Both scripts provide comprehensive configurations for: - Model settings - Data pipeline - Training parameters - System resources - Checkpointing - Validation	2025-01-01 15:10:55 +00:00
OleehyO	26b87cd4ff	feat(args): add validation and arg interface for training parameters - Add field validators for model type and validation settings - Implement command line argument parsing with argparse - Add type hints and documentation for training parameters - Support configuration of model, training, and validation parameters	2025-01-01 15:10:55 +00:00
OleehyO	04a60e7435	Change logger name to trainer	2025-01-01 15:10:55 +00:00
OleehyO	a001842834	feat: implement CogVideoX trainers for I2V and T2V tasks Add and refactor trainers for CogVideoX model variants: - Implement CogVideoXT2VLoraTrainer for text-to-video generation - Refactor CogVideoXI2VLoraTrainer for image-to-video generation Both trainers support LoRA fine-tuning with proper handling of: - Model components loading and initialization - Video encoding and batch collation - Loss computation with noise prediction - Validation step for generation	2025-01-01 15:10:54 +00:00
OleehyO	91d79fd9a4	feat: add schemas module for configuration and state management Add Pydantic models to handle: - CLI arguments and configuration (Args) - Model components and pipeline (Components) - Training state and parameters (State)	2025-01-01 15:10:54 +00:00
OleehyO	45d40450a1	refactor: simplify dataset implementation and add latent precomputation - Replace bucket-based dataset with simpler resize-based implementation - Add video latent precomputation during dataset initialization - Improve code readability and user experience - Remove complexity of bucket sampling for better maintainability This change makes the codebase more straightforward and easier to use while maintaining functionality through resize-based video processing.	2025-01-01 15:10:54 +00:00
OleehyO	6eae5c201e	feat: add latent caching for video encodings - Add caching mechanism to store VAE-encoded video latents to disk - Cache latents in a "latent" subdirectory alongside video files - Skip re-encoding when cached latent file exists - Add logging for successful cache saves - Minor code cleanup and formatting improvements This change improves training efficiency by avoiding redundant video encoding operations.	2025-01-01 15:10:42 +00:00
OleehyO	2a6cca0656	Add type conversion and validation checks	2025-01-01 15:10:42 +00:00
OleehyO	fa4659fb2c	feat(trainer): add validation functionality to Trainer class Add validation capabilities to the Trainer class including: - Support for validating images and videos during training - Periodic validation based on validation_steps parameter - Artifact logging to wandb for validation results - Memory tracking during validation process	2025-01-01 15:10:41 +00:00
OleehyO	6971364591	Export file_utils.py	2025-01-01 15:10:41 +00:00
OleehyO	60f6a3d7ee	feat: add base trainer implementation and training script - Add Trainer base class with core training loop functionality - Implement distributed training setup with Accelerate - Add training script with model/trainer initialization - Support LoRA fine-tuning with checkpointing and validation	2025-01-01 15:10:41 +00:00
OleehyO	a505f2e312	Add constants.py	2025-01-01 15:10:40 +00:00
OleehyO	78f655a9a4	Add utils	2025-01-01 15:10:40 +00:00
OleehyO	85e00a1082	feat(models): add scaffolding	2025-01-01 15:10:40 +00:00
OleehyO	918ebb5a54	feat(datasets): implement video dataset modules - Add dataset implementations for text-to-video and image-to-video - Include bucket sampler for efficient batch processing - Add utility functions for data processing - Create dataset package structure with proper initialization	2025-01-01 15:10:40 +00:00
OleehyO	e3f6def234	feat: add video frame extraction tool Add utility script to extract first frames from videos, helping users convert T2V datasets to I2V format	2025-01-01 15:10:39 +00:00
OleehyO	7b282246dd	chore: remove unused configuration files after refactoring Delete accelerate configs, deepspeed config and host file that are no longer needed	2025-01-01 15:10:39 +00:00
Gforky	48ac9c1066	[fix]fix typo in train_cogvideox_image_to_video_lora.py	2025-01-01 15:10:30 +00:00
Zheng Guang Cong	21693ca770	fix bugs of image-to-video without image-condition	2025-01-01 15:10:30 +00:00
OleehyO	7b4c9db6d9	Fix for CogVideoX-{2B,5B} When loading CogVideoX-{2B,5B}, `patch_size_t` is None, which results in an error in the `prepare_rotary_position_embeddings` function.	2024-12-13 04:02:27 +00:00
OleehyO	36f1333788	Fix for deepspeed training	2024-12-13 04:02:26 +00:00
OleehyO	4d1b9fd166	Fix for Disney video dataset	2024-12-13 04:02:21 +00:00
Yuxuan.Zhang	5aa6d3a9ee	Merge pull request #515 from Gforky/fix_finetune_demo [fix]fix deepspeed initialization issue in finetune examples	2024-12-02 11:29:42 +08:00
spacegoing	2fb763d25f	[Fix] fix rope temporal patch size	2024-11-21 16:26:45 +00:00
luwen.miao	ac2f2c78f7	[fix]fix deepspeed initialization issue in finetune examples	2024-11-18 09:49:31 +00:00
zR	e6ee283d0e	Merge branch 'CogVideoX_dev' of github.com:THUDM/CogVideo into CogVideoX_dev	2024-10-14 11:34:40 +08:00
zR	e169e7b045	Update train_cogvideox_image_to_video_lora.py	2024-10-06 22:50:56 +08:00
Yuxuan.Zhang	532f246d7c	Merge pull request #389 from THUDM/CogVideoX_dev I2V Finetune of CogVideoX-5B-I2V	2024-10-05 22:14:52 +08:00
zR	f28708d845	Update train_cogvideox_image_to_video_lora.py	2024-10-05 22:12:22 +08:00
zR	4339f65660	update	2024-10-05 01:05:27 +08:00
LittleNyima	a59ed84b52	fix deprecation of clear_objs_and_retain_memory	2024-09-30 20:45:53 +08:00
zR	d9e75ce3f5	horse	2024-09-28 16:36:40 +08:00
zR	fbfad9c361	readme change	2024-09-20 17:18:31 +08:00
zR	1f6d9032cc	test finetune	2024-09-20 16:22:58 +08:00
zR	2db0453b96	finetune requirement change	2024-09-20 13:35:03 +08:00
zR	66369a90aa	update of readme and hostfile	2024-09-17 23:42:35 +08:00
zR	db309f3242	update llm_cogvideox_flux demo test	2024-09-17 23:15:19 +08:00
zR	6e64359524	finetune and infer upload	2024-09-16 12:02:27 +08:00

43 Commits
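
Commit a88c1ede69 describes two checks on the train_resolution string: it must have the form frames x height x width, and the number of frames must be a multiple of 8. The following is a minimal sketch of such a check, not the repository's code. The function name parse_train_resolution, the lowercase "x" separator, and the error messages are assumptions made for illustration, and the exact rule in the repository may differ.

```python
import re
from typing import Tuple


def parse_train_resolution(value: str) -> Tuple[int, int, int]:
    """Parse a 'frames x height x width' string such as '48x480x720'.

    Hypothetical helper: the name and the exact accepted format are
    assumptions, based only on the wording of the commit message.
    """
    # Format check: exactly three positive integers separated by 'x'.
    match = re.fullmatch(r"(\d+)x(\d+)x(\d+)", value.strip())
    if match is None:
        raise ValueError(
            f"train_resolution must look like 'FRAMESxHEIGHTxWIDTH', got {value!r}"
        )

    frames, height, width = (int(group) for group in match.groups())

    # Frame-count check as stated in the commit message (multiple of 8).
    if frames % 8 != 0:
        raise ValueError(f"number of frames must be a multiple of 8, got {frames}")

    return frames, height, width
```

Calling parse_train_resolution("48x480x720") under these assumptions yields the three integers as a tuple, while a string with only two components or a frame count that is not divisible by 8 raises ValueError.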