diff --git a/README.md b/README.md
index 3db483e..d5790fd 100644
--- a/README.md
+++ b/README.md
@@ -4,11 +4,13 @@
- 🤗 Experience on CogVideoX Huggingface Space
- ++📚 Check here to view Paper +
@@ -55,18 +57,18 @@ to [清影](https://chatglm.cn/video).
 The table below shows the list of video generation models we currently provide, along with related basic information:
-| Model Name | CogVideoX-2B |
-|-------------------------------------------|--------------------------------------------------------------|
-| Prompt Language | English |
-| GPU Memory Required for Inference (FP16) | 36GB (will be optimized before the PR is merged) |
-| GPU Memory Required for Fine-tuning(bs=1) | 46.2GB |
-| Prompt Max Length | 226 Tokens |
-| Video Length | 6 seconds |
-| Frames Per Second | 8 frames |
-| Resolution | 720 * 480 |
-| Quantized Inference | Not Supported |
-| Multi-card Inference | Not Supported |
-| Download Link | 🤗 [CogVideoX-2B](https://huggingface.co/THUDM/CogVideoX-2B) |
+| Model Name | CogVideoX-2B |
+|-------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------|
+| Prompt Language | English |
+| GPU Memory Required for Inference (FP16) | 36GB using diffusers (will be optimized before the PR is merged) and 25G using [SAT](https://github.com/THUDM/SwissArmyTransformer) |
+| GPU Memory Required for Fine-tuning(bs=1) | 42GB |
+| Prompt Max Length | 226 Tokens |
+| Video Length | 6 seconds |
+| Frames Per Second | 8 frames |
+| Resolution | 720 * 480 |
+| Quantized Inference | Not Supported |
+| Multi-card Inference | Not Supported |
+| Download Link | 🤗 [CogVideoX-2B](https://huggingface.co/THUDM/CogVideoX-2B) |
 ## Project Structure
@@ -89,7 +91,7 @@ of the **CogVideoX** open-source model.
 ### sat
-+ [sat_demo](sat/configs/README_zh.md): Contains the inference code and fine-tuning code of SAT weights. It is
++ [sat_demo](sat/README.md): Contains the inference code and fine-tuning code of SAT weights. It is
 recommended to improve based on the CogVideoX model structure.
 Innovative researchers use this code to better perform rapid stacking and development.
diff --git a/README_zh.md b/README_zh.md
index b831df2..e31ce5c 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -5,11 +5,13 @@
- 🤗 在 CogVideoX Huggingface Space 体验视频生成模型
- ++📚 查看 论文 +
@@ -52,18 +54,18 @@ CogVideoX是 [清影](https://chatglm.cn/video) 同源的开源版本视频生
 下表战展示目前我们提供的视频生成模型列表,以及相关基础信息:
-| 模型名字 | CogVideoX-2B |
-|----------------|--------------------------------------------------------------|
-| 提示词语言 | English |
-| 推理显存消耗 (FP-16) | 36GB |
-| 微调显存消耗 (bs=1) | 46.2GB |
-| 提示词长度上限 | 226 Tokens |
-| 视频长度 | 6 seconds |
-| 帧率(每秒) | 8 frames |
-| 视频分辨率 | 720 * 480 |
-| 量化推理 | 不支持 |
-| 多卡推理 | 不支持 |
-| 权重地址 | 🤗 [CogVideoX-2B](https://huggingface.co/THUDM/CogVideoX-2B) |
+| 模型名字 | CogVideoX-2B |
+|----------------|-------------------------------------------------------------------------------------------------------------------------------------|
+| 提示词语言 | English |
+| 推理显存消耗 (FP-16) | 36GB using diffusers (will be optimized before the PR is merged) and 25G using [SAT](https://github.com/THUDM/SwissArmyTransformer) |
+| 微调显存消耗 (bs=1) | 42GB |
+| 提示词长度上限 | 226 Tokens |
+| 视频长度 | 6 seconds |
+| 帧率(每秒) | 8 frames |
+| 视频分辨率 | 720 * 480 |
+| 量化推理 | 不支持 |
+| 多卡推理 | 不支持 |
+| 权重地址 | 🤗 [CogVideoX-2B](https://huggingface.co/THUDM/CogVideoX-2B) |
 ## 项目结构
@@ -77,12 +79,12 @@ CogVideoX是 [清影](https://chatglm.cn/video) 同源的开源版本视频生
 + [web_demo](inference/web_demo.py): 一个简单的streamlit网页应用,展示如何使用 CogVideoX-2B 模型生成视频。