CogVideo

mirror of https://github.com/THUDM/CogVideo.git synced 2025-11-20 09:52:09 +08:00

Author	SHA1	Message	Date
OleehyO	b362663679	fix: normalize image tensors in I2VDataset	2025-01-12 06:01:48 +00:00
OleehyO	30ba1085ff	Merge remote-tracking branch 'upstream/main' into dev	2025-01-12 05:58:07 +00:00
Zheng Guang Cong	cd861bbe1e	Update i2v_dataset.py: image should also be transformed to [-1, 1]	2025-01-11 17:24:35 +08:00
Zheng Guang Cong	35383e2db3	fix potential bug of i2v. Image value is in [0, 255] and should be transformed into [-1, 1], similar to video.	2025-01-11 17:08:25 +08:00
OleehyO	f6d722cec7	fix: remove copying first video frame as conditioning image	2025-01-09 15:52:51 +00:00
OleehyO	07766001f6	feat(dataset): pad short videos by repeating last frame. When loading videos with fewer frames than max_num_frames, repeat the last frame to reach the required length instead of failing. This ensures consistent tensor dimensions across the dataset while preserving as much original video content as possible.	2025-01-08 02:14:56 +00:00
OleehyO	392e37021a	Add video path to error message for better debugging	2025-01-07 09:50:21 +00:00
OleehyO	e084a4a270	feat: auto-extract first frames as conditioning images for i2v model. When training i2v models without specifying image_column, automatically extract and use first frames from training videos as conditioning images. This includes: (1) add load_images_from_videos() utility function to extract and cache first frames; (2) update BaseI2VDataset to support auto-extraction when image_column is None; (3) add validation and warning message in Args schema for i2v without image_column. The first frames are extracted once and cached to avoid repeated video loading.	2025-01-07 06:43:26 +00:00
OleehyO	36427274d6	style: format import statements across finetune module	2025-01-07 05:54:52 +00:00
zR	1789f07256	format and check fp16 for cogvideox2b	2025-01-07 13:16:18 +08:00
OleehyO	e5b8f9a2ee	feat: add caching for prompt embeddings. Add caching for prompt embeddings; store cached files using safetensors format; add cache directory structure under data_root/cache; optimize memory usage by moving tensors to CPU after caching; add debug logging for cache hits; add info logging for cache writes. The caching system helps reduce redundant computation and memory usage during training by: (1) caching prompt embeddings based on prompt text hash; (2) caching encoded video latents based on video filename; (3) moving tensors to CPU after caching to free GPU memory.	2025-01-04 06:16:31 +00:00
OleehyO	6eae5c201e	feat: add latent caching for video encodings. Add caching mechanism to store VAE-encoded video latents to disk; cache latents in a "latent" subdirectory alongside video files; skip re-encoding when cached latent file exists; add logging for successful cache saves; minor code cleanup and formatting improvements. This change improves training efficiency by avoiding redundant video encoding operations.	2025-01-01 15:10:42 +00:00
OleehyO	2a6cca0656	Add type conversion and validation checks	2025-01-01 15:10:42 +00:00
OleehyO	918ebb5a54	feat(datasets): implement video dataset modules. Add dataset implementations for text-to-video and image-to-video; include bucket sampler for efficient batch processing; add utility functions for data processing; create dataset package structure with proper initialization.	2025-01-01 15:10:40 +00:00
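
Note on commits 07766001f6, 35383e2db3 and b362663679: the padding and normalization steps they describe can be sketched roughly as follows. This is an illustrative sketch only, not the repository's code. It uses NumPy arrays to stay lightweight, the function names `pad_by_repeating_last_frame` and `normalize_to_signed_unit_range` are hypothetical, the `(frames, height, width, channels)` layout is an assumption, and truncating videos that are longer than `max_num_frames` is an assumption the commit messages do not state.

```python
import numpy as np


def pad_by_repeating_last_frame(frames: np.ndarray, max_num_frames: int) -> np.ndarray:
    """Return exactly max_num_frames frames from an (F, H, W, C) array.

    Short videos are padded by repeating the last frame. Longer videos are
    truncated (an assumption made for this sketch).
    """
    num_frames = frames.shape[0]
    if num_frames == 0:
        raise ValueError("cannot pad a video with no frames")
    if num_frames >= max_num_frames:
        return frames[:max_num_frames]
    # frames[-1:] keeps the leading axis so the result can be concatenated.
    padding = np.repeat(frames[-1:], max_num_frames - num_frames, axis=0)
    return np.concatenate([frames, padding], axis=0)


def normalize_to_signed_unit_range(pixels: np.ndarray) -> np.ndarray:
    """Map pixel values from [0, 255] to [-1, 1], for images and videos alike."""
    return pixels.astype(np.float32) / 127.5 - 1.0


# Example: a 3-frame video padded to 5 frames, then normalized.
video = np.zeros((3, 2, 2, 3), dtype=np.uint8)
video[-1] = 255  # make the last frame distinguishable
padded = pad_by_repeating_last_frame(video, max_num_frames=5)
normalized = normalize_to_signed_unit_range(padded)
```

Applying the same normalization to the conditioning image and to the video frames keeps both inputs on the same scale, which is the point the two i2v fixes make.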

14 Commits
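
Note on commits e5b8f9a2ee and 6eae5c201e: the cache layout they describe (prompt embeddings keyed by a hash of the prompt text under `data_root/cache`, video latents keyed by video filename in a `latent` subdirectory next to the videos) could be laid out along these lines. This is a hedged sketch, not the repository's implementation. The helper names, the choice of SHA-256, the `prompt_embeddings` subdirectory name, and the `.safetensors` extension for latent files are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def prompt_cache_path(data_root: str, prompt: str) -> Path:
    """Build a cache file path from a hash of the prompt text.

    SHA-256 and the "prompt_embeddings" directory name are assumptions.
    """
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    return Path(data_root) / "cache" / "prompt_embeddings" / f"{digest}.safetensors"


def latent_cache_path(video_path: str) -> Path:
    """Build a cache file path in a "latent" directory beside the video file."""
    video = Path(video_path)
    return video.parent / "latent" / f"{video.stem}.safetensors"


# Example: identical prompts map to the same cache file, so the embedding
# only needs to be computed once.
first = prompt_cache_path("data", "a cat playing piano")
second = prompt_cache_path("data", "a cat playing piano")
other = prompt_cache_path("data", "a dog surfing")
latent = latent_cache_path("data/videos/clip_001.mp4")
```

A training loop would check whether the returned path exists before encoding, and write the tensor there after a cache miss.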