Update README.md

RVC-Boss 2024-01-16 22:29:50 +08:00 committed by GitHub
parent 0e2467ace4
commit 23b5979889

@@ -4,11 +4,31 @@ I am organizing and uploading the codes. It will be public in one day.
https://www.bilibili.com/video/BV12g4y1m7Uw/
todo
features:
1、Zero-shot TTS: input a 5-second vocal sample.
2、Few-shot TTS: fine-tune with a 1-minute training dataset. A model fine-tuned this way reproduces the speaker's voice with significantly better similarity and realism than zero-shot.
3、Cross-lingual inference (synthesizing a language different from the training dataset language). English, Japanese, and Chinese are currently supported.
4、The WebUI integrates tools such as vocal/accompaniment separation, automatic training-set segmentation, Chinese ASR, and text labeling to help beginners quickly create their own training datasets and GPT/SoVITS models.
# todolist
todo
1、zero-shot voice conversion (5s) / few-shot voice conversion (1min)
2、TTS speaking speed control
3、more TTS emotion control
4、experiment with changing the SoVITS token inputs to a probability distribution over the vocabulary
5、better English and Japanese text frontend
6、tiny and larger-sized TTS models
7、Colab scripts
# Requirements (How to install)
@@ -77,8 +97,54 @@ to
tools/uvr5/uvr5_weights
# dataset format
The format of the TTS annotation .list file:
vocal path|speaker_name|language|text
e.g. D:\GPT-SoVITS\xxx/xxx.wav|xxx|en|I like playing Genshin.
language dictionary:
'zh': Chinese
'ja': Japanese
'en': English
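
As an illustration of the format above, here is a minimal sketch of how one line of a .list file could be parsed. It is not code from this repository: `ListEntry`, `parse_list_line`, and `LANGUAGES` are hypothetical names, and keeping any extra `|` inside the text field is an assumption about how such lines should be treated.

```python
from typing import NamedTuple

# Language codes listed in the language dictionary above.
LANGUAGES = {"zh": "Chinese", "ja": "Japanese", "en": "English"}


class ListEntry(NamedTuple):
    """One annotation line; the class and field names are illustrative."""
    vocal_path: str
    speaker_name: str
    language: str
    text: str


def parse_list_line(line: str) -> ListEntry:
    """Parse 'vocal path|speaker_name|language|text' into a ListEntry."""
    # Split on the first three separators only, so a '|' inside the
    # text field (an assumption, see above) is preserved.
    parts = line.rstrip("\r\n").split("|", 3)
    if len(parts) != 4:
        raise ValueError(f"expected 4 '|'-separated fields, got {len(parts)}")
    entry = ListEntry(*parts)
    if entry.language not in LANGUAGES:
        raise ValueError(f"unknown language code: {entry.language!r}")
    return entry


# The example line from this README.
entry = parse_list_line(r"D:\GPT-SoVITS\xxx/xxx.wav|xxx|en|I like playing Genshin.")
print(entry.speaker_name, LANGUAGES[entry.language], entry.text)
```

Rejecting unknown language codes early is a design choice of this sketch; it surfaces labeling mistakes before any training step reads the file.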
# Credits
todo
https://github.com/innnky/ar-vits
https://github.com/yangdongchao/SoundStorm/tree/master/soundstorm/s1/AR
https://github.com/jaywalnut310/vits
https://github.com/hcy71o/TransferTTS/blob/master/models.py#L556
https://github.com/TencentGameMate/chinese_speech_pretrain
https://github.com/auspicious3000/contentvec/
https://github.com/jik876/hifi-gan
https://huggingface.co/hfl/chinese-roberta-wwm-ext-large
https://github.com/fishaudio/fish-speech/blob/main/tools/llama/generate.py#L41
https://github.com/Anjok07/ultimatevocalremovergui
https://github.com/openvpi/audio-slicer
https://github.com/cronrpc/SubFix
https://modelscope.cn/models/damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch
https://github.com/FFmpeg/FFmpeg
https://github.com/gradio-app/gradio