mirror of https://github.com/THUDM/CogVideo.git (synced 2025-04-06 03:57:56 +08:00)
update community framework
commit 00054de87f (parent 671007ac13)

README.md (21 lines changed)
@@ -20,9 +20,11 @@
 ## Update and News

-- 🔥 **News**: ```2024/8/7```: CogVideoX has been integrated into `diffusers` version 0.30.0. Inference can now be performed
+- 🔥 **News**: ```2024/8/7```: CogVideoX has been integrated into `diffusers` version 0.30.0. Inference can now be
+performed
 on a single 3090 GPU. For more details, please refer to the [code](inference/cli_demo.py).
-- 🔥 **News**: ```2024/8/6```: We have also open-sourced the **3D Causal VAE** used in **CogVideoX-2B**, which can reconstruct
+- 🔥 **News**: ```2024/8/6```: We have also open-sourced the **3D Causal VAE** used in **CogVideoX-2B**, which can
+reconstruct
 the video almost losslessly.
 - 🔥 **News**: ```2024/8/6```: We have open-sourced **CogVideoX-2B**, the first model in the CogVideoX series of video
 generation models.
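
The ```2024/8/7``` item above is directly actionable, so here is a minimal sketch of single-GPU inference with `diffusers` >= 0.30.0 using the `CogVideoXPipeline` class that release introduced. The prompt text and sampler settings are illustrative placeholders; `enable_model_cpu_offload()` is one way to fit the FP16 pipeline on a 24 GB card such as the 3090 mentioned in the news entry.

```python
# Minimal CogVideoX text-to-video sketch (assumes diffusers >= 0.30.0).
import torch
from diffusers import CogVideoXPipeline
from diffusers.utils import export_to_video

# Load the diffusers-format checkpoint in FP16.
pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-2b", torch_dtype=torch.float16)

# Offload idle submodules to CPU so a single 24 GB GPU (e.g. a 3090) suffices.
pipe.enable_model_cpu_offload()

prompt = "A panda playing guitar in a bamboo forest, cinematic lighting"  # placeholder
video = pipe(prompt=prompt, num_inference_steps=50, guidance_scale=6.0).frames[0]

# CogVideoX-2B produces 6-second clips at 8 fps (see the model table below).
export_to_video(video, "output.mp4", fps=8)
```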
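The ```2024/8/6``` 3D Causal VAE entry can be exercised on its own through the `AutoencoderKLCogVideoX` class that `diffusers` ships for this model. The sketch below round-trips a dummy clip; the `vae` subfolder name follows the usual diffusers repo layout, and the random tensor is a stand-in for real video pixels.

```python
# Sketch: round-trip a clip through the CogVideoX 3D Causal VAE
# (assumes diffusers >= 0.30.0 and the standard "vae" subfolder layout).
import torch
from diffusers import AutoencoderKLCogVideoX

vae = AutoencoderKLCogVideoX.from_pretrained(
    "THUDM/CogVideoX-2b", subfolder="vae", torch_dtype=torch.float16
).to("cuda")

# Dummy clip standing in for real pixels: (batch, channels, frames, height, width).
video = torch.randn(1, 3, 9, 256, 256, dtype=torch.float16, device="cuda")

with torch.no_grad():
    latents = vae.encode(video).latent_dist.sample()  # compress in space and time
    recon = vae.decode(latents).sample                # near-lossless reconstruction

print(latents.shape, recon.shape)
```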
@@ -54,9 +56,9 @@ Jump to a specific section:

 ### Prompt Optimization

-Before running the model, please refer to [this guide](inference/convert_demo.py) to see how we use the GLM-4 model to
-optimize the prompt. This is crucial because the model is trained with long prompts, and a good prompt directly affects
-the quality of the generated video.
+Before running the model, please refer to [this guide](inference/convert_demo.py) to see how we use large models like
+GLM-4 (or other comparable products, such as GPT-4) to optimize the prompt. This is crucial because the model is trained
+with long prompts, and a good prompt directly impacts the quality of the video generation.

 ### SAT
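Both sides of this hunk describe the same workflow: expand a terse idea into the long, detailed prompts the model was trained on. A hedged sketch of that idea follows, using an OpenAI-style chat client as a stand-in for GLM-4 or GPT-4; the system instruction and model name are illustrative, not the exact logic of convert_demo.py.

```python
# Sketch of LLM-based prompt expansion; the client, model name, and system
# instruction are illustrative stand-ins, not the convert_demo.py internals.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set; a GLM-4 endpoint works analogously

SYSTEM = (
    "Rewrite short video ideas into long, richly detailed text-to-video prompts: "
    "subjects, motion, camera movement, lighting, and scene composition."
)

def optimize_prompt(short_prompt: str) -> str:
    """Expand a terse user prompt into the long form CogVideoX expects."""
    resp = client.chat.completions.create(
        model="gpt-4o",  # or a comparable model such as GLM-4
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": short_prompt},
        ],
    )
    return resp.choices[0].message.content

print(optimize_prompt("a cat surfing at sunset"))
```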
@@ -123,6 +125,15 @@ along with related basic information:
 | Download Link (HF diffusers Model) | 🤗 [Huggingface](https://huggingface.co/THUDM/CogVideoX-2B) [🤖 ModelScope](https://modelscope.cn/models/ZhipuAI/CogVideoX-2b) [💫 WiseModel](https://wisemodel.cn/models/ZhipuAI/CogVideoX-2b) |
 | Download Link (SAT Model) | [SAT](./sat/README.md) |

+## Friendly Links
+
+We highly welcome contributions from the community and actively contribute to the open-source community. The following
+works have already been adapted for CogVideoX, and we invite everyone to use them:
+
++ [Xorbits Inference](https://github.com/xorbitsai/inference): A powerful and comprehensive distributed inference
+framework, allowing you to easily deploy your own models or the latest cutting-edge open-source models with just one
+click.
+
 ## Project Structure

 This open-source repository will guide developers to quickly get started with the basic usage and fine-tuning examples

README_zh.md (37 lines changed)
@@ -21,7 +21,8 @@

 ## Project Updates

-- 🔥 **News**: ```2024/8/7```: CogVideoX has been merged into `diffusers` version 0.30.0; inference now runs on a single 3090. For details, see the [code](inference/cli_demo.py).
+- 🔥 **News**: ```2024/8/7```: CogVideoX has been merged into `diffusers`
+version 0.30.0; inference now runs on a single 3090. For details, see the [code](inference/cli_demo.py).
 - 🔥 **News**: ```2024/8/6```: We have open-sourced the **3D Causal VAE** used in **CogVideoX-2B**, which can reconstruct videos almost losslessly.
 - 🔥 **News**: ```2024/8/6```: We have open-sourced **CogVideoX-2B**, the first model in the CogVideoX series of video generation models.
 - 🌱 **Source**: ```2022/5/19```: We open-sourced the CogVideo video generation model (now available in the `CogVideo` branch), the first open-source video generation model based on
@@ -50,8 +51,8 @@

 ### Prompt Optimization

-Before running the model, please refer to [this guide](inference/convert_demo.py) to see how we use the GLM-4 model to optimize the prompt. This is important
-because the model is trained with long prompts, and a good prompt directly affects the quality of the generated video.
+Before running the model, please refer to [this guide](inference/convert_demo.py) to see how we use large models like GLM-4 (or other comparable products, such as GPT-4) to optimize the prompt. This is important
+because the model is trained with long prompts, and a good prompt directly affects the quality of the generated video.

 ### SAT
@@ -95,19 +96,25 @@ CogVideoX is the open-source version of the video generation model that shares its origins with [QingYing](https://chatglm.cn/video?fr=osm_cogvideox)

 The table below lists the video generation models we currently provide, along with their basic information:

 | Model Name | CogVideoX-2B |
 |---|---|
 | Prompt Language | English |
 | Single-GPU Inference VRAM Usage (FP16) | 18GB using [SAT](https://github.com/THUDM/SwissArmyTransformer) <br> 23.9GB using diffusers |
 | Multi-GPU Inference VRAM Usage (FP16) | 20GB minimum per GPU using diffusers |
 | Fine-tuning VRAM Usage (bs=1) | 42GB |
 | Maximum Prompt Length | 226 tokens |
 | Video Length | 6 seconds |
 | Frame Rate | 8 frames per second |
 | Video Resolution | 720 * 480 |
 | Quantized Inference | Not supported |
+| Download Link (Diffusers Model) | 🤗 [Huggingface](https://huggingface.co/THUDM/CogVideoX-2B) [🤖 ModelScope](https://modelscope.cn/models/ZhipuAI/CogVideoX-2b) |
 | Download Link (SAT Model) | [SAT](./sat/README_zh.md) |

+## Friendly Links
+
+We highly welcome contributions from the community and actively contribute to the open-source community. The following works have already been adapted for CogVideoX, and everyone is welcome to use them:
+
++ [Xorbits Inference](https://github.com/xorbitsai/inference): A powerful and comprehensive distributed inference framework that lets you deploy your own models or built-in cutting-edge open-source models with one click.
+
 ## Complete Project Structure
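One practical consequence of the table above is the 226-token prompt cap, which comes from the model's T5 text encoder. A quick pre-flight check is sketched below; the `tokenizer` subfolder name is an assumption based on the usual diffusers repo layout.

```python
# Sketch: check an optimized prompt against the 226-token limit in the table.
from transformers import AutoTokenizer

MAX_PROMPT_TOKENS = 226  # limit listed for CogVideoX-2B

# Assumption: the diffusers-format repo keeps its T5 tokenizer under "tokenizer".
tokenizer = AutoTokenizer.from_pretrained("THUDM/CogVideoX-2b", subfolder="tokenizer")

def fits(prompt: str) -> bool:
    """Return True if the prompt stays within the model's token budget."""
    n = len(tokenizer(prompt).input_ids)
    print(f"{n} tokens (limit {MAX_PROMPT_TOKENS})")
    return n <= MAX_PROMPT_TOKENS

fits("A panda playing guitar in a bamboo forest, cinematic lighting")
```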