blackteay/GPT-SoVITS

mirror of https://github.com/RVC-Boss/GPT-SoVITS.git synced 2025-10-03 11:59:58 +08:00

Jarod Mica 54bcce13d2 more import changes

2024-12-23 01:58:31 -08:00

Docker

Pin the version of the ASR model used in the Dockerfile so that it matches cmd-asr.py

2024-02-03 14:49:23 +00:00

docs

Fix typo (#1568)

2024-09-03 11:17:42 +08:00

GPT_SoVITS

more import changes

2024-12-23 01:58:31 -08:00

.gitignore

All in one! Merge the main branch and the fast_inference_ branch (#1490)

2024-08-20 22:19:04 +08:00

api.py

more import changes

2024-12-23 01:58:31 -08:00

config.py

Update config.py

2024-02-21 01:15:31 +00:00

infer_script.py

Add token streaming in batches to the TTS class.

2024-12-23 00:11:17 -08:00

LICENSE

Initial commit

2024-01-15 02:05:22 +08:00

MANIFEST.in

turn it into a package

2024-11-16 02:30:20 -08:00

pyproject.toml

cleanup for package

2024-11-16 02:56:29 -08:00

README.md

update name

2024-11-16 02:57:56 -08:00

requirements.txt

turn it into a package

2024-11-16 02:30:20 -08:00

test_streaming.py

cleanup for package

2024-11-16 02:56:29 -08:00

test.py

cleanup for package

2024-11-16 02:56:29 -08:00

README.md

Jarod's NOTE

I'm working on turning this into a package. Right now it can be installed, and the API does in fact work for making requests.

Quick Install and Usage

Ideally, do all of this inside a venv for package isolation.

Install with:

pip install git+https://github.com/JarodMica/GPT-SoVITS-Package.git

Make sure torch is installed with CUDA enabled. I recommend running pip uninstall torch first, then reinstalling with the command below. I chose 2.4.0+cu121:

pip install torch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 --index-url https://download.pytorch.org/whl/cu121
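To confirm that the reinstall actually produced a CUDA-enabled build, a quick check along these lines may help. This is only a sketch: `describe_torch` is a hypothetical helper name, not part of this package, and it relies only on `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`.

```python
import importlib.util


def describe_torch():
    """Summarize the installed torch build, or return None if torch is absent.

    describe_torch is a hypothetical helper for this README, not a package API.
    """
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    return {
        "version": torch.__version__,            # e.g. a "+cu121" suffix for CUDA wheels
        "cuda_build": torch.version.cuda,         # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    info = describe_torch()
    if info is None:
        print("torch is not installed in this environment")
    else:
        print(info)
        if not info["cuda_available"]:
            print("CUDA is not available; the CPU-only wheel may still be installed")
```

If `cuda_build` comes back as `None`, the CPU-only wheel is most likely still installed and the uninstall/reinstall step is worth repeating.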

So far I've only tested usage with api_v2.py. If the install above went fine, you should now be able to run:

gpt_sovits_api

This boots up a local server that you can make requests to. Check out test.py and test_streaming.py to get an idea of how you might use the API.
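As a rough idea of what such a request might look like, here is a minimal sketch using only the standard library. Everything specific in it is an assumption rather than something confirmed by this README: the address `http://127.0.0.1:9880/tts` and the payload field names are modeled on the upstream api_v2.py, and `build_tts_payload` and `synthesize` are hypothetical helper names. Treat test.py and test_streaming.py as the authoritative examples.

```python
import json
import urllib.request

# Assumption: host, port, and path are modeled on the upstream api_v2.py defaults.
API_URL = "http://127.0.0.1:9880/tts"


def build_tts_payload(text, text_lang, ref_audio_path, prompt_text, prompt_lang):
    """Build a JSON-serializable request body (field names are assumptions)."""
    return {
        "text": text,                      # text to synthesize
        "text_lang": text_lang,            # language of `text`, e.g. "en"
        "ref_audio_path": ref_audio_path,  # reference audio for the target voice
        "prompt_text": prompt_text,        # transcript of the reference audio
        "prompt_lang": prompt_lang,        # language of the transcript
    }


def synthesize(payload, out_path, url=API_URL, timeout=120):
    """POST the payload to the local server and write the returned audio bytes."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path


if __name__ == "__main__":
    # Only builds and prints the payload; call synthesize() once the server is up.
    example = build_tts_payload("Hello there.", "en", "ref.wav", "Reference line.", "en")
    print(json.dumps(example, indent=2))
```

With the server running, `synthesize(example, "out.wav")` would send the request and save the response, provided the assumed endpoint and fields match your installed version.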

Pretrained Models

You probably don't need to follow the instructions below; they are kept here for reference for now.

Download pretrained models from GPT-SoVITS Models and place them in GPT_SoVITS/pretrained_models.
Download G2PW models from G2PWModel_1.1.zip, unzip and rename the folder to G2PWModel, and then place it in GPT_SoVITS/text. (Chinese TTS only)
For UVR5 (vocals/accompaniment separation and reverberation removal, optional), download models from UVR5 Weights and place them in tools/uvr5/uvr5_weights.
For Chinese ASR (optional), download models from Damo ASR Model, Damo VAD Model, and Damo Punc Model and place them in tools/asr/models.
For English or Japanese ASR (optional), download models from Faster Whisper Large V3 and place them in tools/asr/models. Other models may give similar results with a smaller disk footprint.
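If you do place models manually, a small check like the following can report which of the directories named above are still missing or empty. It is a sketch only: `missing_model_dirs` is a hypothetical helper, the relative paths are copied from the list above, and it does not verify that the files inside are the right ones.

```python
from pathlib import Path

# Relative paths taken from the instructions above.
EXPECTED_DIRS = [
    "GPT_SoVITS/pretrained_models",
    "GPT_SoVITS/text/G2PWModel",   # Chinese TTS only
    "tools/uvr5/uvr5_weights",     # optional: UVR5
    "tools/asr/models",            # optional: ASR
]


def missing_model_dirs(repo_root):
    """Return the expected directories that are absent or empty under repo_root."""
    root = Path(repo_root)
    missing = []
    for rel in EXPECTED_DIRS:
        path = root / rel
        # A directory counts as present only if it exists and holds at least one entry.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    for rel in missing_model_dirs("."):
        print(f"missing or empty: {rel}")
```

Run it from the repository root; an empty output means every listed directory exists and contains at least one file or folder.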

Credits

Special thanks to RVC-Boss for getting this wonderful tool up and going, as well as to everyone credited for the work it builds on:

Original Repo: https://github.com/RVC-Boss/GPT-SoVITS

Languages

Python 96.6%

Shell 0.9%

Jupyter Notebook 0.9%

Cuda 0.6%

PowerShell 0.5%

Other 0.4%