diff --git a/README.md b/README.md
index 3d6f31e..c35c617 100644
--- a/README.md
+++ b/README.md
@@ -57,18 +57,19 @@ to [清影](https://chatglm.cn/video).
 
 The table below shows the list of video generation models we currently provide, along with related basic information:
 
-| Model Name | CogVideoX-2B |
-|-------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------|
-| Prompt Language | English |
-| GPU Memory Required for Inference (FP16) | 18GB if using [SAT](https://github.com/THUDM/SwissArmyTransformer); 36GB if using diffusers (will be optimized before the PR is merged) |
-| GPU Memory Required for Fine-tuning(bs=1) | 40GB |
-| Prompt Max Length | 226 Tokens |
-| Video Length | 6 seconds |
-| Frames Per Second | 8 frames |
-| Resolution | 720 * 480 |
-| Quantized Inference | Not Supported |
-| Multi-card Inference | Not Supported |
-| Download Link | 🤗 [CogVideoX-2B](https://huggingface.co/THUDM/CogVideoX-2B) |
+| Model Name | CogVideoX-2B |
+|-------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| Prompt Language | English |
+| GPU Memory Required for Inference (FP16) | 18GB if using [SAT](https://github.com/THUDM/SwissArmyTransformer); 36GB if using diffusers (will be optimized before the PR is merged) |
+| GPU Memory Required for Fine-tuning(bs=1) | 40GB |
+| Prompt Max Length | 226 Tokens |
+| Video Length | 6 seconds |
+| Frames Per Second | 8 frames |
+| Resolution | 720 * 480 |
+| Quantized Inference | Not Supported |
+| Multi-card Inference | Not Supported |
+| Download Link (HF diffusers Model) | 🤗 [Huggingface](https://huggingface.co/THUDM/CogVideoX-2B) [🤖 ModelScope](https://modelscope.cn/models/ZhipuAI/CogVideoX-2b) [💫 WiseModel](https://wisemodel.cn/models/ZhipuAI/CogVideoX-2b) |
+| Download Link (SAT Model) | [SAT](./sat/README.md) |
 
 ## Project Structure
 
@@ -120,7 +121,7 @@ We welcome your contributions. You can click [here](resources/contribute.md) for
 
 The code in this repository is released under the [Apache 2.0 License](LICENSE).
 
-The model weights and implementation code are released under the [CogVideoX LICENSE](Model_License).
+The model weights and implementation code are released under the [CogVideoX LICENSE](MODEL_LICENSE).
 
 ## Citation
 
diff --git a/README_zh.md b/README_zh.md
index b0ef941..bd2c47d 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -54,18 +54,19 @@ CogVideoX是 [清影](https://chatglm.cn/video) 同源的开源版本视频生
 
 下表战展示目前我们提供的视频生成模型列表,以及相关基础信息:
 
-| 模型名字 | CogVideoX-2B |
-|----------------|--------------------------------------------------------------------------------------------------------------------------------------|
-| 提示词语言 | English |
-| 推理显存消耗 (FP-16) | 36GB using diffusers (will be optimized before the PR is merged) and 18GB using [SAT](https://github.com/THUDM/SwissArmyTransformer) |
-| 微调显存消耗 (bs=1) | 42GB |
-| 提示词长度上限 | 226 Tokens |
-| 视频长度 | 6 seconds |
-| 帧率(每秒) | 8 frames |
-| 视频分辨率 | 720 * 480 |
-| 量化推理 | 不支持 |
-| 多卡推理 | 不支持 |
-| 权重地址 | 🤗 [CogVideoX-2B](https://huggingface.co/THUDM/CogVideoX-2B) |
+| 模型名字 | CogVideoX-2B |
+|---------------------|--------------------------------------------------------------------------------------------------------------------------------------|
+| 提示词语言 | English |
+| 推理显存消耗 (FP-16) | 36GB using diffusers (will be optimized before the PR is merged) and 18GB using [SAT](https://github.com/THUDM/SwissArmyTransformer) |
+| 微调显存消耗 (bs=1) | 42GB |
+| 提示词长度上限 | 226 Tokens |
+| 视频长度 | 6 seconds |
+| 帧率(每秒) | 8 frames |
+| 视频分辨率 | 720 * 480 |
+| 量化推理 | 不支持 |
+| 多卡推理 | 不支持 |
+| 下载地址 (Diffusers 模型) | 🤗 [Huggingface](https://huggingface.co/THUDM/CogVideoX-2B) [🤖 ModelScope](https://modelscope.cn/models/ZhipuAI/CogVideoX-2b) |
+| 下载地址 (SAT 模型) | [SAT](./sat/README_zh.md) |
 
 ## 项目结构
 
diff --git a/requirements.txt b/requirements.txt
index c5d9a82..68b2e91 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -5,4 +5,5 @@ streamlit>=1.37.0
 opencv-python>=4.10
 imageio-ffmpeg>=0.5.1
 openai>=1.38.0
-transformers>=4.43.3
\ No newline at end of file
+transformers>=4.43.3
+accelerate>=0.33.0
\ No newline at end of file
diff --git a/sat/README.md b/sat/README.md
index b93de31..8d9ac8f 100644
--- a/sat/README.md
+++ b/sat/README.md
@@ -22,7 +22,7 @@ mkdir CogVideoX-2b-sat
 cd CogVideoX-2b-sat
 wget https://cloud.tsinghua.edu.cn/f/fdba7608a49c463ba754/?dl=1
 mv 'index.html?dl=1' vae.zip
-uzip vae.zip
+unzip vae.zip
 wget https://cloud.tsinghua.edu.cn/f/556a3e1329e74f1bac45/?dl=1
 mv 'index.html?dl=1' transformer.zip
 unzip transformer.zip
diff --git a/sat/README_zh.md b/sat/README_zh.md
index 6fea5ab..ba301c8 100644
--- a/sat/README_zh.md
+++ b/sat/README_zh.md
@@ -21,7 +21,7 @@ mkdir CogVideoX-2b-sat
 cd CogVideoX-2b-sat
 wget https://cloud.tsinghua.edu.cn/f/fdba7608a49c463ba754/?dl=1
 mv 'index.html?dl=1' vae.zip
-uzip vae.zip
+unzip vae.zip
 wget https://cloud.tsinghua.edu.cn/f/556a3e1329e74f1bac45/?dl=1
 mv 'index.html?dl=1' transformer.zip
 unzip transformer.zip