1064 Commits

Author SHA1 Message Date
zpeng11
fe7bccaa97
Merge 8858492f56326dce521db2c4a4b3a7323e786596 into 9ec3a60f30d228719e5ec6cd6796c5b2d888dd1a 2025-12-01 22:52:14 -08:00
RVC-Boss
9ec3a60f30
Update config.py 2025-12-01 20:23:49 +08:00
RVC-Boss
fc533b6fb7
Update fasterwhisper_asr.py 2025-12-01 11:38:37 +08:00
XXXXRT666
857799276c
Fix Modelscope (#2679) 2025-12-01 11:13:15 +08:00
Spr_Aachen
92d2d337fd
Fix training error caused by float type of default_batch_size parameter (#2662) 2025-11-28 22:53:43 +08:00
ChasonJiang
6fb441f65e
More user-friendly streaming mode options (#2678) 2025-11-28 22:13:48 +08:00
XXXXRT666
c85c54eca9
Add ModelScope Snapshot Download For ASR (#2627)
* Add ModelScope Snapshot Download For ASR

* Typo Fix

* Remove YUE in whisper

* Remove HF ENDPOINT

* Add FunASR Download
2025-11-28 22:10:49 +08:00
RVC-Boss
cb00840c4e
Add files via upload 2025-11-28 22:02:03 +08:00
wzy3650
60a4a214af
vq distributed training support (#2577)
Co-authored-by: wangzeyuan <wangzeyuan@agora.io>
2025-11-28 21:57:13 +08:00
zzz
6375bbe316
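Try stream infer (#2469)
* Try stream infer

* Plot the generated audio in the stream_infer script

* stream_infer: add an export section

* stream_infer: plots that make patterns easier to spot

* stream_infer: run a correlation search when splicing audio to reduce fundamental-frequency breaks caused by splicing

* stream_infer: export `find_best_audio_offset_fast`

* stream_infer: improve the waveform display for easier comparison

* stream_v2pro.py: read parameters from the command line

* stream_v2pro.py: shorten the text used for export

* stream_v2pro: fix spec overflow, which left the output silent, caused by half-precision input to spectrogram_torch

* stream_v2pro: add a --lang parameter to indicate the language of the reference text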
2025-11-28 21:36:57 +08:00
KamioRinn
e00ca92140
Fix ASMD (#2636) 2025-11-28 21:22:43 +08:00
ChasonJiang
92ab59c553
Finer-grained streaming inference mode (#2671)
* Better streaming inference mode

* Clean up unused code

* modified:   GPT_SoVITS/AR/models/t2s_model.py
	modified:   GPT_SoVITS/TTS_infer_pack/TTS.py
	modified:   GPT_SoVITS/module/models.py

* modified:   GPT_SoVITS/TTS_infer_pack/TTS.py

* modified:   .gitignore
	modified:   GPT_SoVITS/AR/models/t2s_model.py
	modified:   GPT_SoVITS/TTS_infer_pack/TTS.py
	modified:   GPT_SoVITS/module/models.py

* modified:   GPT_SoVITS/AR/models/t2s_model.py
	modified:   GPT_SoVITS/TTS_infer_pack/TTS.py
	modified:   GPT_SoVITS/module/models.py
	modified:   api_v2.py

* modified:   GPT_SoVITS/TTS_infer_pack/TTS.py

* Fix typos

* Support streaming inference with a fixed chunk length; optimize the SOLA algorithm

* Fix ogg format transmission issue in api_v2
2025-11-28 21:12:41 +08:00
RVC-Boss
11aa78bd9b
Fix environment variables possibly not being str
2025-09-10 15:01:04 +08:00
zpeng11
8858492f56 feat:enable speed control for v1v2 2025-08-31 00:14:07 -04:00
zpeng11
337da7454e feat:remove prints 2025-08-26 17:09:36 -04:00
zpeng11
3e63595f0e feat:update kv cache to [len, head, dim] to allow linear size increase 2025-08-26 17:03:29 -04:00
zpeng11
fa84e262ae feat:remove code not needed for main 2025-08-25 22:53:15 -04:00
zpeng11
968ac4c264 feat: solved problem, export works 2025-08-25 22:37:52 -04:00
zpeng11
419909b443 failed, testing expand y 2025-08-25 21:57:36 -04:00
zpeng11
c85ee3d521 feat:successfully unified first step and following step 2025-08-25 17:57:04 -04:00
zpeng11
d413a4f5b1 run time working 2025-08-25 17:07:38 -04:00
zpeng11
26228402e3 feat:solve unified kv cache shape handling, todo: clean up upper level to unify first and following step 2025-08-25 12:06:26 -04:00
zpeng11
0c5f61f98c feat:rename and features to onnx export 2025-08-25 01:46:53 -04:00
zpeng11
633e478b24 feat:clean up playground 2025-08-24 02:37:34 -04:00
zpeng11
942caa888e feat:supporting half export 2025-08-24 02:29:33 -04:00
zpeng11
72c5d3224e utility updates 2025-08-24 02:11:47 -04:00
zpeng11
48d52778ce feat:clean up export logics and add notes 2025-08-24 02:00:32 -04:00
zpeng11
e4d1894a8f feat:experiments with onnx with attention, but does not work well; todo:clean code and try v3v4 2025-08-24 00:46:29 -04:00
zpeng11
5982080939 feat:updated fsdecode and decoder interface 2025-08-23 17:35:21 -04:00
zpeng11
b45cbc3561 feat: sampling params working now for export, todo:fold weights clean code 2025-08-23 13:03:02 -04:00
zpeng11
9ed42daa88 feat: allow fsdec and sdec to have sampling params 2025-08-23 12:17:04 -04:00
zpeng11
3ccd1c0ea3 fix: solved t2s ending problem, and verified infer&forward has same output under deterministic random, fixed topk to 15 2025-08-23 03:31:01 -04:00
zpeng11
63cbb6efa7 verified audio, text, synthesizer all working, todo:dig into t2s for error 2025-08-22 01:36:58 -04:00
zpeng11
e8fdf472c0 feat:onnx friendly loop with same function 2025-08-21 22:23:50 -04:00
zpeng11
77794a5923 feat:export onnx with combined graph ready, todo:link weights in onnx graph 2025-08-21 01:52:34 -04:00
zpeng11
16d30ce1e4 feat:get ready for if node merge 2025-08-21 00:34:56 -04:00
zpeng11
403c5bf320 feat:v1v2 both works for export 2025-08-20 20:57:29 -04:00
zpeng11
bc7fe01876 feat:update v2pro's gpt path, todo:work on v1 transform 2025-08-20 20:41:20 -04:00
zpeng11
bb529e7e4a update namings 2025-08-20 20:21:42 -04:00
zpeng11
4e0cc57052 update init_step name 2025-08-20 20:05:07 -04:00
zpeng11
aafa0561d8 correctly setup onnx export, solved problem 2025-08-20 19:39:10 -04:00
zpeng11
94b31a250f limit vits for one input a time 2025-08-20 18:39:21 -04:00
zpeng11
fd0fb35a49 fix spectrum take out working 2025-08-20 18:32:38 -04:00
zpeng11
911c53b1ee fixed using hubert for full run, 80 works 2025-08-20 17:37:41 -04:00
zpeng11
1cdd41d877 fixed resample audio and tested in full run 2025-08-20 16:47:55 -04:00
zpeng11
da5aa78224 feat:combined fsdc and encoder, todo:extract audio pipeline 2025-08-20 02:24:59 -04:00
zpeng11
71cbe28e68 feat:optimize looping 2025-08-19 21:31:42 -04:00
zpeng11
5c08328cf3 feat:voice and text preprocess system verified, todo:disassemble onnx export of gsv 2025-08-19 21:10:21 -04:00
zpeng11
dd156f15aa feat:clean up playground explore audio preprocess, todo:build free run from pure input data 2025-08-19 01:22:00 -04:00
zpeng11
aef9d26580 feat:text_bert and audio_hubert exports are ready and fully tested, todo:solve dependency in playground runs 2025-08-19 00:05:45 -04:00