Update README.md

2026-01-05 17:23:44 +08:00 · 2024-01-16 22:29:50 +08:00 · 2024-01-16 22:29:50 +08:00 · 23b5979889
commit 23b5979889
parent 0e2467ace4
1 changed files with 69 additions and 3 deletions
--- a/README.md
+++ b/README.md
@@ -4,11 +4,31 @@ I am organizing and uploading the codes. It will be public in one day.
 https://www.bilibili.com/video/BV12g4y1m7Uw/
-todo
+features:
 1、Zero-shot TTS: input a 5-second vocal sample
 2、Few-shot TTS: fine-tune on a 1-minute training dataset. A model fine-tuned this way reproduces the speaker's voice with significantly better similarity and realism than zero-shot.
 3、Cross-lingual inference (synthesizing speech in a language different from that of the training dataset); English, Japanese, and Chinese are currently supported
 4、The WebUI integrates tools such as vocal/accompaniment separation, automatic training-set segmentation, Chinese ASR, and text labeling to help beginners quickly create their own training datasets and GPT/SoVITS models.
 # todolist
-todo
+1、zero-shot voice conversion (5s) / few-shot voice conversion (1min)
 2、TTS speaking speed control
 3、more TTS emotion control
 4、experiment with changing the SoVITS token inputs to a probability distribution over the vocabulary
 5、better English and Japanese text frontend
 6、tiny and larger-sized versions of the TTS models
 7、Colab scripts
 # Requirements (How to install)
@@ -77,8 +97,54 @@ to
 tools/uvr5/uvr5_weights
 # dataset format
 The format of the TTS annotation .list file:
 vocal path|speaker_name|language|text
 e.g. D:\GPT-SoVITS\xxx/xxx.wav|xxx|en|I like playing Genshin.
 language dictionary:
    'zh': Chinese
    'ja': Japanese
    'en': English
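
As a rough illustration of the `.list` format above, here is a minimal Python sketch that parses one annotation line into its four fields. The names `Annotation`, `LANGUAGES`, and `parse_list_line` are hypothetical and not part of this repository. The sketch also assumes that the first three fields never contain `|`, so any further `|` is treated as part of the text.

```python
from dataclasses import dataclass

# Language codes taken from the language dictionary in the README.
LANGUAGES = {"zh": "Chinese", "ja": "Japanese", "en": "English"}


@dataclass
class Annotation:
    """One line of a .list file (hypothetical helper type)."""
    vocal_path: str
    speaker_name: str
    language: str
    text: str


def parse_list_line(line: str) -> Annotation:
    """Parse 'vocal path|speaker_name|language|text' into an Annotation."""
    # Split on the first three pipes only. Assumption: a '|' that appears
    # later belongs to the text field and should be preserved.
    parts = line.rstrip("\r\n").split("|", 3)
    if len(parts) != 4:
        raise ValueError(f"expected 4 '|'-separated fields, got {len(parts)}: {line!r}")
    vocal_path, speaker_name, language, text = parts
    if language not in LANGUAGES:
        raise ValueError(f"unknown language code {language!r}; expected one of {sorted(LANGUAGES)}")
    return Annotation(vocal_path, speaker_name, language, text)


if __name__ == "__main__":
    # The example line from the README; a raw string keeps the backslashes literal.
    item = parse_list_line(r"D:\GPT-SoVITS\xxx/xxx.wav|xxx|en|I like playing Genshin.")
    print(item.speaker_name, LANGUAGES[item.language], item.text)
```

Checking each line like this before training can surface a missing field or an unsupported language code early. How the project itself reads the file may differ.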
 # Credits
-todo
+https://github.com/innnky/ar-vits
 https://github.com/yangdongchao/SoundStorm/tree/master/soundstorm/s1/AR
 https://github.com/jaywalnut310/vits
 https://github.com/hcy71o/TransferTTS/blob/master/models.py#L556
 https://github.com/TencentGameMate/chinese_speech_pretrain
 https://github.com/auspicious3000/contentvec/
 https://github.com/jik876/hifi-gan
 https://huggingface.co/hfl/chinese-roberta-wwm-ext-large
 https://github.com/fishaudio/fish-speech/blob/main/tools/llama/generate.py#L41
 https://github.com/Anjok07/ultimatevocalremovergui
 https://github.com/openvpi/audio-slicer
 https://github.com/cronrpc/SubFix
 https://modelscope.cn/models/damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch
 https://github.com/FFmpeg/FFmpeg
 https://github.com/gradio-app/gradio