From 8f7e5557be4f6bb3d51976e85bd8c23b789514e1 Mon Sep 17 00:00:00 2001
From: XXXXRT666 <157766680+XXXXRT666@users.noreply.github.com>
Date: Fri, 10 Oct 2025 12:03:02 +0100
Subject: [PATCH] .
---
README.md         | 126 +++++++++++++++++++++++++++++----------------
docs/cn/README.md | 109 +++++++++++++++++++++++++++------------
docs/ja/README.md | 125 ++++++++++++++++++++++++++++++---------------
docs/ko/README.md | 127 ++++++++++++++++++++++++++++++----------------
docs/tr/README.md | 127 +++++++++++++++++++++++++++++++---------------
5 files changed, 412 insertions(+), 202 deletions(-)
diff --git a/README.md b/README.md
index 8476db4e..33c442e9 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,5 @@
+#
+
GPT-SoVITS-WebUI
@@ -7,8 +9,6 @@ A Powerful Few-shot Voice Conversion and Text-to-Speech WebUI.

-
-
[](https://www.python.org)
[](https://github.com/RVC-Boss/gpt-sovits/releases)
@@ -27,8 +27,12 @@ A Powerful Few-shot Voice Conversion and Text-to-Speech WebUI.
---
+
+
## Features
+
+
1. **Zero-shot TTS:** Input a 5-second vocal sample and experience instant text-to-speech conversion.
2. **Few-shot TTS:** Fine-tune the model with just 1 minute of training data for improved voice similarity and realism.
@@ -43,31 +47,39 @@ Unseen speakers few-shot fine-tuning demo:
+
+
## Infer Speed
-| Device | RTF | TTFB | Batch Size | Backend |
-| ----------- | ----- | ------ | ---------- | --------------------------- |
-| RTX 5090 | 0.05 | 150 ms | 1 | Flash Attn Varlen CUDAGraph |
-| RTX 4090 | 0.014 | UNK | 24 | Flash Attn Varlen CUDAGraph |
-| RTX 4060 Ti | 0.07 | 460 ms | 1 | Flash Attn Varlen CUDAGraph |
-| RTX 4060 Ti | 0.028 | UNK | 28 | Flash Attn Varlen CUDAGraph |
-| Apple M4 | 0.21 | | 1 | MLX Quantized Affined |
+| Device | RTF | TTFB | Batch Size | Backend |
+| :---------: | :---: | :----: | :--------: | :-------------------------: |
+| RTX 5090 | 0.05 | 150 ms | 1 | Flash Attn Varlen CUDAGraph |
+| RTX 4090 | 0.014 | UNK | 24 | Flash Attn Varlen CUDAGraph |
+| RTX 4060 Ti | 0.07 | 460 ms | 1 | Flash Attn Varlen CUDAGraph |
+| RTX 4060 Ti | 0.028 | UNK | 28 | Flash Attn Varlen CUDAGraph |
+| Apple M4 | 0.21 | UNK | 1 | MLX Quantized Affined |
+
+
**User guide: [简体中文](https://www.yuque.com/baicaigongchang1145haoyuangong/ib3g1e) | [English](https://rentry.co/GPT-SoVITS-guide#/)**
+
+
## Installation
-For users in China, you can [click here](https://www.codewithgpu.com/i/RVC-Boss/GPT-SoVITS/GPT-SoVITS-Official) to use AutoDL Cloud Docker to experience the full functionality online.
+Users in China can [use AutoDL Cloud Docker](https://www.codewithgpu.com/i/RVC-Boss/GPT-SoVITS/GPT-SoVITS-Official) to experience the full functionality online.
### Tested Environments
-| Python Version | PyTorch Version | Device |
-| -------------- | --------------- | ------------- |
-| Python 3.10 | PyTorch 2.5.1 | CUDA 12.4 |
-| Python 3.11 | PyTorch 2.5.1 | CUDA 12.4 |
-| Python 3.11 | PyTorch 2.7.0 | CUDA 12.8 |
-| Python 3.11 | PyTorch 2.8.0 | Apple Silicon |
-| Python 3.10 | PyTorch 2.8.0 | CPU |
+| Python Version | PyTorch Version | Device |
+| :------------: | :-------------: | :-----------: |
+| Python 3.10 | PyTorch 2.5.1 | CUDA 12.4 |
+| Python 3.11 | PyTorch 2.5.1 | CUDA 12.4 |
+| Python 3.11 | PyTorch 2.7.0 | CUDA 12.8 |
+| Python 3.11 | PyTorch 2.8.0 | Apple Silicon |
+| Python 3.10 | PyTorch 2.8.0 | CPU |
+
+
### Windows
@@ -103,8 +115,12 @@ conda activate GPTSoVits
bash install.sh --device --source [--download-uvr5]
```
+
+
### Install Manually
+
+
#### Install Dependencies
```bash
@@ -143,8 +159,12 @@ Install [Visual Studio 2017](https://aka.ms/vs/17/release/vc_redist.x86.exe)
brew install ffmpeg
```
+
+
### Running GPT-SoVITS with Docker
+
+
#### Docker Image Selection
Due to rapid development in the codebase and a slower Docker image release cycle, please:
@@ -193,8 +213,12 @@ Once the container is running in the background, you can access it using:
docker exec -it bash
```
+
+
## Pretrained Models
+
+
**If `install.sh` runs successfully, you may skip steps 1, 2, and 3.**
**Users in China can [download all these models here](https://www.yuque.com/baicaigongchang1145haoyuangong/ib3g1e/dkxgpiy9zb96hob4#nVNhX).**
@@ -213,8 +237,12 @@ docker exec -it
+
## Dataset Format
+
+
The TTS annotation .list file format:
```text
@@ -239,10 +267,14 @@ D:\GPT-SoVITS\xxx/xxx.wav|xxx|en|I like playing Genshin.
```
+