update community framework

This commit is contained in:
zR 2024-08-09 19:48:47 +08:00
parent 671007ac13
commit 00054de87f
2 changed files with 38 additions and 20 deletions

View File

@ -20,9 +20,11 @@
## Update and News
- 🔥 **News**: ```2024/8/7```: CogVideoX has been integrated into `diffusers` version 0.30.0. Inference can now be performed
- 🔥 **News**: ```2024/8/7```: CogVideoX has been integrated into `diffusers` version 0.30.0. Inference can now be
performed
on a single 3090 GPU. For more details, please refer to the [code](inference/cli_demo.py).
- 🔥 **News**: ```2024/8/6```: We have also open-sourced **3D Causal VAE** used in **CogVideoX-2B**, which can reconstruct
- 🔥 **News**: ```2024/8/6```: We have also open-sourced **3D Causal VAE** used in **CogVideoX-2B**, which can
reconstruct
the video almost losslessly.
- 🔥 **News**: ```2024/8/6```: We have open-sourced **CogVideoX-2B**the first model in the CogVideoX series of video
generation models.
@ -54,9 +56,9 @@ Jump to a specific section:
### Prompt Optimization
Before running the model, please refer to [this guide](inference/convert_demo.py) to see how we use the GLM-4 model to
optimize the prompt. This is crucial because the model is trained with long prompts, and a good prompt directly affects
the quality of the generated video.
Before running the model, please refer to [this guide](inference/convert_demo.py) to see how we use large models like
GLM-4 (or other comparable products, such as GPT-4) to optimize the model. This is crucial because the model is trained
with long prompts, and a good prompt directly impacts the quality of the video generation.
### SAT
@ -123,6 +125,15 @@ along with related basic information:
| Download Link (HF diffusers Model) | 🤗 [Huggingface](https://huggingface.co/THUDM/CogVideoX-2B) [🤖 ModelScope](https://modelscope.cn/models/ZhipuAI/CogVideoX-2b) [💫 WiseModel](https://wisemodel.cn/models/ZhipuAI/CogVideoX-2b) |
| Download Link (SAT Model) | [SAT](./sat/README.md) |
## Friendly Links
We highly welcome contributions from the community and actively contribute to the open-source community. The following
works have already been adapted for CogVideoX, and we invite everyone to use them:
+ [Xorbits Inference](https://github.com/xorbitsai/inference): A powerful and comprehensive distributed inference
framework, allowing you to easily deploy your own models or the latest cutting-edge open-source models with just one
click.
## Project Structure
This open-source repository will guide developers to quickly get started with the basic usage and fine-tuning examples

View File

@ -21,7 +21,8 @@
## 项目更新
- 🔥 **News**: ```2024/8/7```: CogVideoX 已经合并入 `diffusers` 0.30.0版本单张3090可以推理详情请见[代码](inference/cli_demo.py)。
- 🔥 **News**: ```2024/8/7```: CogVideoX 已经合并入 `diffusers`
0.30.0版本单张3090可以推理详情请见[代码](inference/cli_demo.py)。
- 🔥 **News**: ```2024/8/6```: 我们开源 **3D Causal VAE**,用于 **CogVideoX-2B**,可以几乎无损地重构视频。
- 🔥 **News**: ```2024/8/6```: 我们开源 CogVideoX 系列视频生成模型的第一个模型, **CogVideoX-2B**
- 🌱 **Source**: ```2022/5/19```: 我们开源了 CogVideo 视频生成模型(现在你可以在 `CogVideo` 分支中看到),这是首个开源的基于
@ -50,8 +51,8 @@
### 提示词优化
在开始运行模型之前,请参考[这里](inference/convert_demo.py) 查看我们是怎么使用GLM-4大模型对模型进行优化的这很重要
由于模型是在长提示词下训练的,一额好的直接影响了视频生成的质量。
在开始运行模型之前,请参考[这里](inference/convert_demo.py) 查看我们是怎么使用GLM-4(或者同级别的其他产品例如GPT-4)大模型对模型进行优化的,这很重要,
由于模型是在长提示词下训练的,一个好的提示词直接影响了视频生成的质量。
### SAT
@ -95,19 +96,25 @@ CogVideoX是 [清影](https://chatglm.cn/video?fr=osm_cogvideox) 同源的开源
下表展示目前我们提供的视频生成模型列表,以及相关基础信息:
| 模型名 | CogVideoX-2B |
|---------------------|-------------------------------------------------------------------------------------------------------------------------------|
| 提示词语言 | English |
| 单GPU推理 (FP-16) 显存消耗 | 18GB using [SAT](https://github.com/THUDM/SwissArmyTransformer) <br> 23.9GB using diffusers |
| 多GPU推理 (FP-16) 显存消耗 | 20GB minimum per GPU using diffusers |
| 微调显存消耗 (bs=1) | 42GB |
| 提示词长度上限 | 226 Tokens |
| 视频长度 | 6 seconds |
| 帧率(每秒) | 8 frames |
| 视频分辨率 | 720 * 480 |
| 量化推理 | 不支持 |
| 模型名 | CogVideoX-2B |
|---------------------|---------------------------------------------------------------------------------------------------------------------------------|
| 提示词语言 | English |
| 单GPU推理 (FP-16) 显存消耗 | 18GB using [SAT](https://github.com/THUDM/SwissArmyTransformer) <br> 23.9GB using diffusers |
| 多GPU推理 (FP-16) 显存消耗 | 20GB minimum per GPU using diffusers |
| 微调显存消耗 (bs=1) | 42GB |
| 提示词长度上限 | 226 Tokens |
| 视频长度 | 6 seconds |
| 帧率(每秒) | 8 frames |
| 视频分辨率 | 720 * 480 |
| 量化推理 | 不支持 |
| 下载地址 (Diffusers 模型) | 🤗 [Huggingface](https://huggingface.co/THUDM/CogVideoX-2B) [🤖 ModelScope](https://modelscope.cn/models/ZhipuAI/CogVideoX-2b) |
| 下载地址 (SAT 模型) | [SAT](./sat/README_zh.md) |
| 下载地址 (SAT 模型) | [SAT](./sat/README_zh.md) |
## 友情链接
我们非常欢迎来自社区的贡献并积极的贡献开源社区。以下作品已经对CogVideoX进行了适配欢迎大家使用:
+ [Xorbits Inference](https://github.com/xorbitsai/inference): 性能强大且功能全面的分布式推理框架,轻松一键部署你自己的模型或内置的前沿开源模型。
## 完整项目代码结构