Updated README, README_zh, and gradio_demo

Yuvraj Sharma 2024-08-06 18:14:29 +05:30
parent 8e8275d2e8
commit 51b7f53e7a
4 changed files with 12 additions and 2 deletions

@@ -82,6 +82,12 @@ of the **CogVideoX** open-source model.
+ [cli_demo](inference/cli_demo.py): A more detailed explanation of the inference code, mentioning the significance of common parameters.
+ [cli_vae_demo](inference/cli_vae_demo.py): Executing the VAE inference code alone currently requires 71GB of GPU memory, but it will be optimized in the future.
+ [convert_demo](inference/convert_demo.py): How to convert user input into a format suitable for CogVideoX. Because CogVideoX is trained on long captions, we need to convert the input text to be consistent with the training distribution using an LLM. By default, the script uses GLM4, but it can be replaced with any other LLM such as GPT or Gemini.
+ [gradio_demo](gradio_demo.py): A simple gradio web UI demonstrating how to use the CogVideoX-2B model to generate videos.
<div style="text-align: center;">
<img src="resources/gradio_demo.png" style="width: 100%; height: auto;" />
</div>
+ [web_demo](inference/web_demo.py): A simple streamlit web application demonstrating how to use the CogVideoX-2B model to generate videos.
<div style="text-align: center;">

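@@ -77,6 +77,12 @@ CogVideoX, from the same source as [清影](https://chatglm.cn/video?fr=osm_cogvideox), is the open-source
+ [cli_demo](inference/cli_demo.py): A more detailed walkthrough of the inference code; the meaning of common parameters is covered here.
+ [cli_vae_demo](inference/cli_vae_demo.py): Running the VAE inference code alone currently requires 71GB of GPU memory; this will be optimized in the future.
+ [convert_demo](inference/convert_demo.py): How to convert user input into long-form input suitable for CogVideoX. Because CogVideoX is trained on long text, the input text must be converted by an LLM into long text consistent with the training distribution. The script uses GLM4 by default, and it can be replaced with any other large language model such as GPT or Gemini.
+ [gradio_demo](gradio_demo.py): A simple gradio web application showing how to use the CogVideoX-2B model to generate videos.
<div style="text-align: center;">
<img src="resources/gradio_demo.png" style="width: 100%; height: auto;" />
</div>
+ [web_demo](inference/web_demo.py): A simple streamlit web application showing how to use the CogVideoX-2B model to generate videos.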
<div style="text-align: center;">

@@ -9,7 +9,6 @@ import torch
 from diffusers import CogVideoXPipeline
 from datetime import datetime, timedelta
 from openai import OpenAI
-import spaces
 import imageio
 import moviepy.editor as mp
 from typing import List, Union
@@ -88,7 +87,6 @@ def convert_prompt(prompt: str, retry_times: int = 3) -> str:
     return prompt
-@spaces.GPU(duration=240)
 def infer(
     prompt: str,
     num_inference_steps: int,

BIN resources/gradio_demo.png (normal file)
Binary file not shown. Size: 597 KiB