GPU memory optimization

zR 2024-08-06 03:04:06 +08:00
parent ad855f622c
commit f7721c7fd2
4 changed files with 26 additions and 26 deletions

@ -57,18 +57,18 @@ to [清影](https://chatglm.cn/video).
 The table below shows the list of video generation models we currently provide,
 along with related basic information:
 | Model Name | CogVideoX-2B |
 |-------------------------------------------|----------------------------------------------|
 | Prompt Language | English |
-| GPU Memory Required for Inference (FP16) | 36GB using diffusers (will be optimized before the PR is merged) and 25G using [SAT](https://github.com/THUDM/SwissArmyTransformer) |
+| GPU Memory Required for Inference (FP16) | 36GB using diffusers (will be optimized before the PR is merged) and 18GB using [SAT](https://github.com/THUDM/SwissArmyTransformer) |
 | GPU Memory Required for Fine-tuning(bs=1) | 42GB |
 | Prompt Max Length | 226 Tokens |
 | Video Length | 6 seconds |
 | Frames Per Second | 8 frames |
 | Resolution | 720 * 480 |
 | Quantized Inference | Not Supported |
 | Multi-card Inference | Not Supported |
 | Download Link | 🤗 [CogVideoX-2B](https://huggingface.co/THUDM/CogVideoX-2B) |
 ## Project Structure
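
For scale, the video length, frame rate, and resolution in the table can be turned into a rough size for the raw output tensor. This is only a back-of-the-envelope sketch: the three RGB channels and the FP16 storage of the decoded frames are assumptions, and the frame count the pipeline actually produces may differ from the simple product used here.

```python
# Values copied from the model table; CHANNELS and BYTES_PER_VALUE are assumptions.
VIDEO_SECONDS = 6
FRAMES_PER_SECOND = 8
WIDTH, HEIGHT = 720, 480
CHANNELS = 3         # assumption: RGB frames
BYTES_PER_VALUE = 2  # assumption: frames held in FP16


def frame_count(seconds, fps):
    """Number of frames implied by a clip length and a frame rate."""
    return seconds * fps


def raw_video_bytes(frames, height, width,
                    channels=CHANNELS, bytes_per_value=BYTES_PER_VALUE):
    """Size in bytes of an uncompressed frames x height x width x channels tensor."""
    return frames * height * width * channels * bytes_per_value


frames = frame_count(VIDEO_SECONDS, FRAMES_PER_SECOND)
size_mib = raw_video_bytes(frames, HEIGHT, WIDTH) / 1024**2
print(f"{frames} frames, about {size_mib:.1f} MiB as an FP16 RGB tensor")
```

Under these assumptions the decoded clip is on the order of 100 MiB, so the 18GB to 36GB quoted for inference is dominated by model weights and intermediate activations rather than by the output itself.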

@ -54,18 +54,18 @@ CogVideoX是 [清影](https://chatglm.cn/video) 同源的开源版本视频生
 The table below lists the video generation models we currently provide, along with related basic information:
 | Model Name | CogVideoX-2B |
 |----------------|----------------------------------------------|
 | Prompt Language | English |
-| GPU Memory Required for Inference (FP-16) | 36GB using diffusers (will be optimized before the PR is merged) and 25G using [SAT](https://github.com/THUDM/SwissArmyTransformer) |
+| GPU Memory Required for Inference (FP-16) | 36GB using diffusers (will be optimized before the PR is merged) and 18GB using [SAT](https://github.com/THUDM/SwissArmyTransformer) |
 | GPU Memory Required for Fine-tuning (bs=1) | 42GB |
 | Prompt Max Length | 226 Tokens |
 | Video Length | 6 seconds |
 | Frame Rate (per second) | 8 frames |
 | Video Resolution | 720 * 480 |
 | Quantized Inference | Not Supported |
 | Multi-card Inference | Not Supported |
 | Weights Link | 🤗 [CogVideoX-2B](https://huggingface.co/THUDM/CogVideoX-2B) |
 ## Project Structure

@ -43,7 +43,7 @@ def generate_video(
     # Load the pre-trained CogVideoX pipeline with the specified precision (float16) and move it to the specified device
     pipe = CogVideoXPipeline.from_pretrained(model_path, torch_dtype=dtype).to(device)
+    pipe.enable_sequential_cpu_offload()  # Enable sequential CPU offload to reduce GPU memory usage (inference becomes slower)
     # Encode the prompt to get the prompt embeddings
     prompt_embeds, _ = pipe.encode_prompt(
         prompt=prompt,  # The textual description for video generation
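
The hunk above turns on sequential CPU offload in diffusers. A minimal sketch of the same idea as a standalone helper is shown below. `place_pipeline` and `load_pipeline` are hypothetical names, not part of the repository, and the sketch skips the explicit `.to(device)` when offloading, which differs from the line in the diff. It assumes `diffusers>=0.30.0`, `torch`, and `accelerate` are installed.

```python
def place_pipeline(pipe, device="cuda", sequential_offload=True):
    """Place a diffusers pipeline for inference (hypothetical helper).

    With sequential_offload=True the weights stay on the CPU and each
    submodule is moved to the GPU only while its forward pass runs, which
    lowers peak GPU memory at the cost of slower inference. Otherwise the
    whole pipeline is moved to `device`.
    """
    if sequential_offload:
        pipe.enable_sequential_cpu_offload()
    else:
        pipe.to(device)
    return pipe


def load_pipeline(model_path="THUDM/CogVideoX-2b", device="cuda",
                  sequential_offload=True):
    """Load CogVideoX in float16 and apply the chosen placement."""
    # Imported lazily so the helper above can be used without these packages.
    import torch
    from diffusers import CogVideoXPipeline

    pipe = CogVideoXPipeline.from_pretrained(model_path, torch_dtype=torch.float16)
    return place_pipeline(pipe, device=device, sequential_offload=sequential_offload)
```

Calling `load_pipeline()` would return a pipeline with offload enabled; passing `sequential_offload=False` keeps everything resident on the GPU, which is faster when enough memory is available.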

@ -4,7 +4,7 @@ This script demonstrates how to encode video frames using a pre-trained CogVideo
 Note:
 This script requires the `diffusers>=0.30.0` library to be installed.
 If the video appears completely green and cannot be viewed, please switch to a different player to watch it. This is a normal phenomenon.
-Cost 71GB of GPU memory for encoding a 1-minute video at 720p resolution.
+Cost 71GB of GPU memory for encoding a 6s video at 720p resolution.
 Run the script:
 $ python cli_demo.py --model_path THUDM/CogVideoX-2b --video_path path/to/video.mp4 --output_path path/to/output
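
Memory figures such as the 71GB quoted in this docstring can be checked on a given machine with PyTorch's peak-memory counters. The sketch below is one possible way to do so; `measure_peak_gib` and `bytes_to_gib` are hypothetical helpers, and the result is in GiB, which may not be the unit the docstring uses.

```python
def bytes_to_gib(n_bytes):
    """Convert a byte count to GiB (1 GiB = 1024**3 bytes)."""
    return n_bytes / 1024**3


def measure_peak_gib(fn, *args, **kwargs):
    """Run fn and report the peak GPU memory PyTorch allocated during the call.

    Requires torch and a CUDA device. Only memory allocated through PyTorch's
    caching allocator on the current device is counted.
    """
    import torch  # imported lazily so bytes_to_gib works without torch

    torch.cuda.reset_peak_memory_stats()
    result = fn(*args, **kwargs)
    torch.cuda.synchronize()  # make sure queued GPU work has finished
    return result, bytes_to_gib(torch.cuda.max_memory_allocated())
```

Wrapping the encoding call, for example `_, peak = measure_peak_gib(encode, video)` with `encode` standing in for whatever function does the work, gives a number to compare against the documented figure.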