docs: update READMEs with auto first-frame extraction feature

2026-01-09 22:57:04 +08:00 · 2025-01-07 06:45:10 +00:00
commit ee1f666206
parent e084a4a270
3 changed files with 14 additions and 22 deletions
--- a/finetune/README.md
+++ b/finetune/README.md
@@ -1,6 +1,6 @@
 # CogVideoX Diffusers Fine-tuning Guide
-[Read this in Chinese](./README_zh.md)
+[中文阅读](./README_zh.md)
 [日本語で読む](./README_ja.md)
@@ -25,21 +25,19 @@ First, you need to prepare your dataset. Depending on your task type (T2V or I2V
 ├── prompts.txt
 ├── videos
 ├── videos.txt
-├── images # Only for I2V tasks
+├── images     # (Optional) For I2V, if not provided, first frame will be extracted from video as reference
-└── images.txt # Only for I2V tasks
+└── images.txt # (Optional) For I2V, if not provided, first frame will be extracted from video as reference
 ```
 Where:
 - `prompts.txt`: Contains the prompts
 - `videos/`: Contains the .mp4 video files
 - `videos.txt`: Contains the list of video files in the `videos/` directory
-- `images/`: Contains the .png reference image files (only for I2V tasks)
+- `images/`: (Optional) Contains the .png reference image files
-- `images.txt`: Contains the list of reference image files (only for I2V tasks)
+- `images.txt`: (Optional) Contains the list of reference image files
 You can download a sample dataset (T2V) [Disney Steamboat Willie](https://huggingface.co/datasets/Wild-Heart/Disney-VideoGeneration-Dataset).
 > We provide a script to extract the first frame of a video as an image [here](./scripts/extract_images.py). You can use this script to generate reference images for I2V tasks.
 If you need to use a validation dataset during training, make sure to provide a validation dataset with the same format as the training dataset.
 ## Run the Script to Start Fine-tuning
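Note on the `finetune/README.md` change: the updated layout makes `images/` and `images.txt` optional for I2V, with the first video frame used as the reference image when they are absent. The sketch below illustrates that fallback decision only. It is not the repository's loader: the function names `read_list` and `plan_i2v_references` are hypothetical, and it assumes each `.txt` file holds one entry per line with line *i* of every file describing the same sample. A `None` reference marks a video whose first frame would have to be extracted by the training code.

```python
from pathlib import Path
from typing import List, Optional, Tuple


def read_list(path: Path) -> List[str]:
    """Return the non-empty, stripped lines of a text file."""
    text = path.read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def plan_i2v_references(data_root: Path) -> List[Tuple[str, Optional[str]]]:
    """Pair each video with a reference image, or None if none was provided.

    Assumption (not confirmed by the README): entries are line-aligned
    across prompts.txt, videos.txt and images.txt.
    """
    prompts = read_list(data_root / "prompts.txt")
    videos = read_list(data_root / "videos.txt")
    if len(prompts) != len(videos):
        raise ValueError(
            f"{len(prompts)} prompts but {len(videos)} videos; counts must match"
        )

    images_file = data_root / "images.txt"
    if not images_file.exists():
        # No images.txt: every sample falls back to first-frame extraction.
        return [(video, None) for video in videos]

    images = read_list(images_file)
    if len(images) != len(videos):
        raise ValueError(
            f"{len(images)} images but {len(videos)} videos; counts must match"
        )
    return list(zip(videos, images))
```

Calling `plan_i2v_references(Path("path/to/dataset"))` on a directory without `images.txt` yields `(video, None)` pairs, which a caller could use to decide when frame extraction is needed.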
--- a/finetune/README_ja.md
+++ b/finetune/README_ja.md
@@ -1,6 +1,6 @@
 # CogVideoX Diffusers ファインチューニングガイド
-[中国語で読む](./README_zh.md)
+[中文阅读](./README_zh.md)
 [Read in English](./README.md)
@@ -25,20 +25,16 @@ pip install -e .
 ├── prompts.txt
 ├── videos
 ├── videos.txt
-├── images # I2Vタスクの場合のみ
+├── images     # (オプション) I2Vの場合。提供されない場合、動画の最初のフレームが参照画像として使用されます
-└── images.txt # I2Vタスクの場合のみ
+└── images.txt # (オプション) I2Vの場合。提供されない場合、動画の最初のフレームが参照画像として使用されます
 ```
 各ファイルの役割は以下の通りです：
 - `prompts.txt`: プロンプトを格納
 - `videos/`: .mp4 動画ファイルを格納
 - `videos.txt`: `videos/` フォルダ内の動画ファイルリストを格納
-- `images/`: .png 形式の参照画像ファイル（I2Vタスクの場合のみ）
+- `images/`: (オプション) .png 形式の参照画像ファイル
-- `images.txt`: 参照画像ファイルリスト（I2Vタスクの場合のみ）
+- `images.txt`: (オプション) 参照画像ファイルリスト
 サンプルデータセット（T2V）として、[ディズニー スチームボート・ウィリー](https://huggingface.co/datasets/Wild-Heart/Disney-VideoGeneration-Dataset)をダウンロードできます。
 > 動画の最初のフレームを画像として抽出するスクリプトは[こちら](./scripts/extract_images.py)で提供しています。I2Vタスクの場合、このスクリプトを使用して参照画像を生成できます。
 トレーニング中に検証データセットを使用する場合は、トレーニングデータセットと同じフォーマットで検証データセットを提供する必要があります。
--- a/finetune/README_zh.md
+++ b/finetune/README_zh.md
@@ -25,21 +25,19 @@ pip install -e .
 ├── prompts.txt
 ├── videos
 ├── videos.txt
-├── images # 仅 I2V 需要
+├── images     # (可选) 对于I2V，若不提供，则从视频中提取第一帧作为参考图像
-└── images.txt # 仅 I2V 需要
+└── images.txt # (可选) 对于I2V，若不提供，则从视频中提取第一帧作为参考图像
 ```
 其中：
 - `prompts.txt`: 存放提示词
 - `videos/`: 存放.mp4视频文件
 - `videos.txt`: 存放 videos 目录中的视频文件列表
-- `images/`: 存放.png参考图像文件
+- `images/`: (可选) 存放.png参考图像文件
-- `images.txt`: 存放参考图像文件列表
+- `images.txt`: (可选) 存放参考图像文件列表
 你可以从这里下载示例数据集(T2V) [迪士尼汽船威利号](https://huggingface.co/datasets/Wild-Heart/Disney-VideoGeneration-Dataset)
 > 我们在[这里](./scripts/extract_images.py)提供了提取视频第一帧为图片的脚本，对于I2V任务您可以使用它来生成参考图像。
 如果需要在训练过程中进行validation，则需要额外提供验证数据集，其中数据格式与训练集相同。
 ## 运行脚本，开始微调