This commit introduces a knowledge distillation module to enhance logo generation in the CogVideoX-2B text-to-video model. The key changes are:

- A new `KDTrainer` class that inherits from `CogVideoXT2VLoraTrainer`. The trainer loads a frozen teacher model and adds a knowledge distillation loss that guides the student model (see the trainer sketch after this list).
- Updated teacher model loading that supports a VGG16-based Faster R-CNN, for compatibility with user-provided weights. This includes a custom construction of the Faster R-CNN model with a VGG16 backbone and matching RoI heads (see the builder sketch below).
- Support for the `kd` training type, selectable from the command line.
- New command-line arguments (`teacher_model_path`, `teacher_model_num_classes`, `kd_loss_weight`) to configure the knowledge distillation process (see the argument sketch below).
- A new example configuration file, `cogvideox_2b_kd.yaml`, for running a `kd` training session.
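The trainer code itself is not shown here, but the described flow, a base student loss plus a weighted teacher-guided term, might look roughly like the minimal sketch below. The stub base class, the method signatures, and the feature-matching form of the KD loss are all assumptions; the real `CogVideoXT2VLoraTrainer` interface lives in the repo's finetune package.

```python
import torch
import torch.nn.functional as F


class CogVideoXT2VLoraTrainerStub:
    """Stand-in for the real CogVideoXT2VLoraTrainer (interface assumed)."""

    def compute_loss(self, student_pred, target):
        # Base denoising objective of the student model.
        return F.mse_loss(student_pred, target)


class KDTrainer(CogVideoXT2VLoraTrainerStub):
    """Adds a weighted knowledge-distillation term to the base loss."""

    def __init__(self, teacher_model, kd_loss_weight=1.0):
        self.teacher_model = teacher_model.eval()  # teacher stays frozen
        for p in self.teacher_model.parameters():
            p.requires_grad_(False)
        self.kd_loss_weight = kd_loss_weight

    def compute_loss(self, student_pred, target, frames):
        base_loss = super().compute_loss(student_pred, target)
        with torch.no_grad():
            # Teacher output used purely as a guidance signal.
            teacher_out = self.teacher_model(frames)
        # Feature-matching KD term; the commit does not spell out the exact form.
        kd_loss = F.mse_loss(student_pred, teacher_out)
        return base_loss + self.kd_loss_weight * kd_loss
```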
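Building a Faster R-CNN around a VGG16 backbone is a standard torchvision pattern, so the custom construction described above plausibly resembles the following sketch. The backbone cut, the anchor sizes, and the `build_vgg16_faster_rcnn` name are assumptions; `teacher_model_num_classes` and `teacher_model_path` would map onto the `num_classes` and `weights_path` parameters.

```python
import torch
import torchvision
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.rpn import AnchorGenerator
from torchvision.ops import MultiScaleRoIAlign


def build_vgg16_faster_rcnn(num_classes, weights_path=None):
    # VGG16 convolutional trunk; dropping the final max-pool keeps a
    # stride-16 feature map, as in the original VGG16 Faster R-CNN.
    vgg = torchvision.models.vgg16(weights=None)
    backbone = vgg.features[:-1]
    backbone.out_channels = 512  # FasterRCNN reads this attribute

    # Single-level anchors over the one VGG feature map (sizes assumed).
    anchor_generator = AnchorGenerator(
        sizes=((128, 256, 512),),
        aspect_ratios=((0.5, 1.0, 2.0),),
    )
    # The RoI heads pool from the lone feature map, which FasterRCNN names "0".
    roi_pooler = MultiScaleRoIAlign(
        featmap_names=["0"], output_size=7, sampling_ratio=2
    )
    model = FasterRCNN(
        backbone,
        num_classes=num_classes,
        rpn_anchor_generator=anchor_generator,
        box_roi_pool=roi_pooler,
    )
    if weights_path is not None:
        # Assumes the checkpoint is a plain state_dict for this architecture.
        model.load_state_dict(torch.load(weights_path, map_location="cpu"))
    return model
```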
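The three new options could be wired up with argparse roughly as follows. The flag spellings match the names in the commit, but the defaults, help text, and `choices` list are assumptions; the example `cogvideox_2b_kd.yaml` would presumably set `training_type: kd` together with these same keys.

```python
import argparse


def add_kd_args(parser):
    # Options named in the commit; defaults and help text are assumptions.
    parser.add_argument("--training_type", type=str, default="lora",
                        choices=["lora", "sft", "kd"],
                        help="Training mode; 'kd' enables knowledge distillation.")
    parser.add_argument("--teacher_model_path", type=str, default=None,
                        help="Path to the teacher checkpoint "
                             "(e.g. VGG16 Faster R-CNN weights).")
    parser.add_argument("--teacher_model_num_classes", type=int, default=2,
                        help="Number of classes the teacher detector was trained with.")
    parser.add_argument("--kd_loss_weight", type=float, default=1.0,
                        help="Weight of the distillation term added to the base loss.")
    return parser


if __name__ == "__main__":
    args = add_kd_args(argparse.ArgumentParser()).parse_args()
    print(args)
```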