mirror of
https://github.com/THUDM/CogVideo.git
synced 2025-04-05 19:41:59 +08:00
GPU memory optimization
parent
ad855f622c
commit
f7721c7fd2
24
README.md
@@ -57,18 +57,18 @@ to [清影](https://chatglm.cn/video).
 The table below shows the list of video generation models we currently provide,
 along with related basic information:

-| Model Name                                | CogVideoX-2B                                                                                                                         |
-|-------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------|
-| Prompt Language                           | English                                                                                                                              |
-| GPU Memory Required for Inference (FP16)  | 36GB using diffusers (will be optimized before the PR is merged) and 25G using [SAT](https://github.com/THUDM/SwissArmyTransformer) |
-| GPU Memory Required for Fine-tuning(bs=1) | 42GB                                                                                                                                 |
-| Prompt Max Length                         | 226 Tokens                                                                                                                           |
-| Video Length                              | 6 seconds                                                                                                                            |
-| Frames Per Second                         | 8 frames                                                                                                                             |
-| Resolution                                | 720 * 480                                                                                                                            |
-| Quantized Inference                       | Not Supported                                                                                                                        |
-| Multi-card Inference                      | Not Supported                                                                                                                        |
-| Download Link                             | 🤗 [CogVideoX-2B](https://huggingface.co/THUDM/CogVideoX-2B)                                                                         |
+| Model Name                                | CogVideoX-2B                                                                                                                          |
+|-------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------|
+| Prompt Language                           | English                                                                                                                               |
+| GPU Memory Required for Inference (FP16)  | 36GB using diffusers (will be optimized before the PR is merged) and 18GB using [SAT](https://github.com/THUDM/SwissArmyTransformer) |
+| GPU Memory Required for Fine-tuning(bs=1) | 42GB                                                                                                                                  |
+| Prompt Max Length                         | 226 Tokens                                                                                                                            |
+| Video Length                              | 6 seconds                                                                                                                             |
+| Frames Per Second                         | 8 frames                                                                                                                              |
+| Resolution                                | 720 * 480                                                                                                                             |
+| Quantized Inference                       | Not Supported                                                                                                                         |
+| Multi-card Inference                      | Not Supported                                                                                                                         |
+| Download Link                             | 🤗 [CogVideoX-2B](https://huggingface.co/THUDM/CogVideoX-2B)                                                                          |

 ## Project Structure

24
README_zh.md
@@ -54,18 +54,18 @@ CogVideoX是 [清影](https://chatglm.cn/video) 同源的开源版本视频生

 下表战展示目前我们提供的视频生成模型列表,以及相关基础信息:

-| 模型名字           | CogVideoX-2B                                                                                                                        |
-|----------------|-------------------------------------------------------------------------------------------------------------------------------------|
-| 提示词语言          | English                                                                                                                             |
-| 推理显存消耗 (FP-16) | 36GB using diffusers (will be optimized before the PR is merged) and 25G using [SAT](https://github.com/THUDM/SwissArmyTransformer) |
-| 微调显存消耗 (bs=1)  | 42GB                                                                                                                                |
-| 提示词长度上限        | 226 Tokens                                                                                                                          |
-| 视频长度           | 6 seconds                                                                                                                           |
-| 帧率(每秒)         | 8 frames                                                                                                                            |
-| 视频分辨率          | 720 * 480                                                                                                                           |
-| 量化推理           | 不支持                                                                                                                                 |
-| 多卡推理           | 不支持                                                                                                                                 |
-| 权重地址           | 🤗 [CogVideoX-2B](https://huggingface.co/THUDM/CogVideoX-2B)                                                                        |
+| 模型名字           | CogVideoX-2B                                                                                                                         |
+|----------------|--------------------------------------------------------------------------------------------------------------------------------------|
+| 提示词语言          | English                                                                                                                              |
+| 推理显存消耗 (FP-16) | 36GB using diffusers (will be optimized before the PR is merged) and 18GB using [SAT](https://github.com/THUDM/SwissArmyTransformer) |
+| 微调显存消耗 (bs=1)  | 42GB                                                                                                                                 |
+| 提示词长度上限        | 226 Tokens                                                                                                                           |
+| 视频长度           | 6 seconds                                                                                                                            |
+| 帧率(每秒)         | 8 frames                                                                                                                             |
+| 视频分辨率          | 720 * 480                                                                                                                            |
+| 量化推理           | 不支持                                                                                                                                  |
+| 多卡推理           | 不支持                                                                                                                                  |
+| 权重地址           | 🤗 [CogVideoX-2B](https://huggingface.co/THUDM/CogVideoX-2B)                                                                         |

 ## 项目结构

@@ -43,7 +43,7 @@ def generate_video(

     # Load the pre-trained CogVideoX pipeline with the specified precision (float16) and move it to the specified device
     pipe = CogVideoXPipeline.from_pretrained(model_path, torch_dtype=dtype).to(device)
-
+    pipe.enable_sequential_cpu_offload()  # Enable sequential CPU offload to reduce GPU memory usage (inference becomes slower)
     # Encode the prompt to get the prompt embeddings
     prompt_embeds, _ = pipe.encode_prompt(
         prompt=prompt,  # The textual description for video generation
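Review note: the hunk above turns on sequential CPU offload. The idea is that weights stay on the CPU and each submodule is moved to the GPU only while it runs, so peak GPU memory is bounded by the largest resident piece rather than by the sum of all of them. The extra host-to-device transfers generally make inference slower. The toy model below is a sketch of that idea in plain Python. It does not call diffusers, the `Module` class and `peak_gpu_memory` function are hypothetical names, the component sizes are made up for illustration and are not measurements of CogVideoX, and activations are ignored.

```python
from dataclasses import dataclass


@dataclass
class Module:
    name: str
    size_gb: float  # hypothetical weight size, for illustration only


def peak_gpu_memory(modules, sequential_offload):
    """Return the peak GPU-resident weight size (GB) over one forward pass."""
    if not sequential_offload:
        # All modules live on the GPU at the same time.
        return sum(m.size_gb for m in modules)
    peak = 0.0
    for m in modules:
        # Only the module that is currently executing is on the GPU;
        # it is moved back to the CPU before the next one is loaded.
        peak = max(peak, m.size_gb)
    return peak


# Hypothetical component sizes; NOT measurements of CogVideoX.
pipeline = [
    Module("text_encoder", 9.0),
    Module("transformer", 4.0),
    Module("vae", 0.5),
]

all_on_gpu = peak_gpu_memory(pipeline, sequential_offload=False)
offloaded = peak_gpu_memory(pipeline, sequential_offload=True)
print(f"all on GPU: {all_on_gpu} GB, sequential offload: {offloaded} GB")
```

In this toy setup the offloaded peak equals the largest single module, which is why the technique helps most when no single component dominates the total.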
@@ -4,7 +4,7 @@ This script demonstrates how to encode video frames using a pre-trained CogVideo
 Note:
 This script requires the `diffusers>=0.30.0` library to be installed.
 If the video appears “completely green” and cannot be viewed, please switch to a different player to watch it. This is a normal phenomenon.
-Cost 71GB of GPU memory for encoding a 1-minute video at 720p resolution.
+Cost 71GB of GPU memory for encoding a 6s video at 720p resolution.

 Run the script:
 $ python cli_demo.py --model_path THUDM/CogVideoX-2b --video_path path/to/video.mp4 --output_path path/to/output
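Review note: for a sense of scale, the raw input clip is small compared with the 71GB figure in the docstring. The sketch below estimates the size of the uncompressed input tensor, assuming the 6 second length, 8 frames per second and 720 * 480 resolution from the README table, 3 color channels, and 2 bytes per value (FP16). `raw_video_bytes` is a hypothetical helper, and the docstring's "720p" may refer to a different frame size.

```python
def raw_video_bytes(seconds, fps, height, width, channels=3, bytes_per_value=2):
    """Size in bytes of an uncompressed video tensor (frames x channels x H x W)."""
    frames = seconds * fps
    return frames * channels * height * width * bytes_per_value


# Assumed values taken from the README table: 6 s, 8 fps, 720 * 480, FP16.
size = raw_video_bytes(seconds=6, fps=8, height=480, width=720)
print(f"{size / 1e6:.1f} MB")  # prints "99.5 MB"
```

Under these assumptions the input is roughly 0.1GB, which suggests that most of the reported memory goes to model weights and intermediate activations rather than to the input itself.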