pip install torch numpy scipy tensorboard librosa==0.9.2 numba==0.56.4 pytorch-lightning gradio==3.14.0 ffmpeg-python onnxruntime tqdm==4.59.0 cn2an pypinyin pyopenjtalk g2p_en
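Several packages in the command above are pinned to exact versions. The following is a minimal sketch for confirming that the active environment matches those pins. The `PINNED` table and the `check_pins` helper are illustrative names, not part of GPT-SoVITS; the version strings are copied from the install command.

```python
from importlib import metadata

# Pins copied from the install command above (hypothetical helper table).
PINNED = {
    "librosa": "0.9.2",
    "numba": "0.56.4",
    "gradio": "3.14.0",
    "tqdm": "4.59.0",
}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (expected, found)} for each mismatched or missing package."""
    problems = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    for name, (expected, found) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means every pinned package is installed at the expected version.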

Additionally, if you need the Chinese ASR feature (provided by funasr), install:

pip install modelscope torchaudio sentencepiece funasr

You need ffmpeg.

Ubuntu/Debian users

sudo apt install ffmpeg

macOS users

brew install ffmpeg

Windows users

Download ffmpeg.exe and ffprobe.exe and place them in the GPT-SoVITS root directory.
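As a quick sanity check after installing ffmpeg, the sketch below looks for a tool first on `PATH` and then in the repository root, which is where the Windows instructions place the `.exe` files. The `find_tool` helper and its `repo_root` parameter are hypothetical names for illustration, and the sketch assumes it is run from the GPT-SoVITS root.

```python
import shutil
from pathlib import Path


def find_tool(name, repo_root="."):
    """Return a path to `name` found on PATH or in repo_root, else None."""
    found = shutil.which(name)
    if found:
        return found
    # Fall back to the repository root, with and without the .exe suffix.
    for candidate in (Path(repo_root) / name, Path(repo_root) / f"{name}.exe"):
        if candidate.is_file():
            return str(candidate)
    return None


if __name__ == "__main__":
    for tool in ("ffmpeg", "ffprobe"):
        print(tool, "->", find_tool(tool) or "NOT FOUND")
```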

You need to download some pretrained models.

Pretrained GPT-SoVITS models / SSL feature model / Chinese BERT model

Download these files from

https://huggingface.co/lj1995/GPT-SoVITS

and place them in

GPT_SoVITS\pretrained_models

Chinese ASR (optional)

Download the files from

https://modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/files

https://modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch/files

https://modelscope.cn/models/damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch/files

and place them in

tools/damo_asr/models

UVR5 (vocals/accompaniment separation and reverberation removal, optional)

Download the models you need from

https://huggingface.co/lj1995/VoiceConversionWebUI/tree/main/uvr5_weights

and place them in

tools/uvr5/uvr5_weights
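To verify that the downloads landed in the right places, the sketch below reports which of the model directories named above are missing or empty. The directory paths come from this README; the `REQUIRED` and `OPTIONAL` lists and the `empty_or_missing` helper are hypothetical names, and the check only looks for the presence of files, not for specific filenames.

```python
from pathlib import Path

# Directories named in this README, relative to the repository root.
REQUIRED = ["GPT_SoVITS/pretrained_models"]
OPTIONAL = ["tools/damo_asr/models", "tools/uvr5/uvr5_weights"]


def empty_or_missing(repo_root, rel_dirs):
    """Return the entries of rel_dirs that do not exist or contain nothing."""
    root = Path(repo_root)
    problems = []
    for rel in rel_dirs:
        directory = root / rel
        if not directory.is_dir() or not any(directory.iterdir()):
            problems.append(rel)
    return problems


if __name__ == "__main__":
    for rel in empty_or_missing(".", REQUIRED):
        print("missing required models:", rel)
    for rel in empty_or_missing(".", OPTIONAL):
        print("optional models not installed:", rel)
```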

Credits

todo
