GPT-SoVITS

mirror of https://github.com/RVC-Boss/GPT-SoVITS.git synced 2025-12-17 01:59:08 +08:00

Author	SHA1	Message	Date
zpeng11	fe7bccaa97	Merge 8858492f56326dce521db2c4a4b3a7323e786596 into 9ec3a60f30d228719e5ec6cd6796c5b2d888dd1a	2025-12-01 22:52:14 -08:00
RVC-Boss	9ec3a60f30	Update config.py	2025-12-01 20:23:49 +08:00
RVC-Boss	fc533b6fb7	Update fasterwhisper_asr.py	2025-12-01 11:38:37 +08:00
XXXXRT666	857799276c	Fix Modelscope (#2679 )	2025-12-01 11:13:15 +08:00
Spr_Aachen	92d2d337fd	Fix training error caused by float type of default_batch_size parameter (#2662 )	2025-11-28 22:53:43 +08:00
ChasonJiang	6fb441f65e	More user-friendly streaming mode options (#2678 )	2025-11-28 22:13:48 +08:00
XXXXRT666	c85c54eca9	Add ModelScope Snapshot Download For ASR (#2627 ) * Add ModelScope Snapshot Download For ASR * Typo Fix * Remove YUE in whisper * Remove HF ENDPOINT * Add FunASR Download	2025-11-28 22:10:49 +08:00
RVC-Boss	cb00840c4e	Add files via upload	2025-11-28 22:02:03 +08:00
wzy3650	60a4a214af	vq distributed training support (#2577 ) Co-authored-by: wangzeyuan <wangzeyuan@agora.io>	2025-11-28 21:57:13 +08:00
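zzz	6375bbe316	Try stream infer (#2469 ) * Try stream infer * Plot the generated audio in the stream_infer script * stream_infer: add an export section * stream_infer: plots that make patterns easier to spot * stream_infer: run a correlation search when splicing audio to reduce fundamental-frequency breaks caused by splicing * stream_infer: export `find_best_audio_offset_fast` * stream_infer: improve the waveform display for easier comparison * stream_v2pro.py: read arguments from the command line * stream_v2pro.py: shorten the text used for export * stream_v2pro: fix the issue where half-precision input to spectrogram_torch made spec overflow, resulting in no sound * stream_v2pro: add a --lang argument to indicate the language of the reference text	2025-11-28 21:36:57 +08:00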
KamioRinn	e00ca92140	Fix ASMD (#2636 )	2025-11-28 21:22:43 +08:00
ChasonJiang	92ab59c553	Finer-grained streaming inference mode (#2671 ) * Better streaming inference mode * Clean up unused code * modified: GPT_SoVITS/AR/models/t2s_model.py modified: GPT_SoVITS/TTS_infer_pack/TTS.py modified: GPT_SoVITS/module/models.py * modified: GPT_SoVITS/TTS_infer_pack/TTS.py * modified: .gitignore modified: GPT_SoVITS/AR/models/t2s_model.py modified: GPT_SoVITS/TTS_infer_pack/TTS.py modified: GPT_SoVITS/module/models.py * modified: GPT_SoVITS/AR/models/t2s_model.py modified: GPT_SoVITS/TTS_infer_pack/TTS.py modified: GPT_SoVITS/module/models.py modified: api_v2.py * modified: GPT_SoVITS/TTS_infer_pack/TTS.py * Fix typos * Support streaming inference with a fixed chunk length, optimize the SOLA algorithm * Fix the ogg format transmission issue in api_v2	2025-11-28 21:12:41 +08:00
RVC-Boss	11aa78bd9b	Fix the issue where an environment variable may not be a str	2025-09-10 15:01:04 +08:00
zpeng11	8858492f56	feat:enable speed control for v1v2	2025-08-31 00:14:07 -04:00
zpeng11	337da7454e	feat:remove prints	2025-08-26 17:09:36 -04:00
zpeng11	3e63595f0e	feat:update kv cache to [len, head, dim] to allow linear size increasement	2025-08-26 17:03:29 -04:00
zpeng11	fa84e262ae	feat:remove unneed for main	2025-08-25 22:53:15 -04:00
zpeng11	968ac4c264	feat: solved problem, export works	2025-08-25 22:37:52 -04:00
zpeng11	419909b443	failed , testing expand y	2025-08-25 21:57:36 -04:00
zpeng11	c85ee3d521	feat:successfully unified first step and following step	2025-08-25 17:57:04 -04:00
zpeng11	d413a4f5b1	run time working	2025-08-25 17:07:38 -04:00
zpeng11	26228402e3	feat:solve unified kv cache shape handling, todo: clean up upper level to unify first and following step	2025-08-25 12:06:26 -04:00
zpeng11	0c5f61f98c	feat:rename and features to onnx export	2025-08-25 01:46:53 -04:00
zpeng11	633e478b24	feat:clean up playground	2025-08-24 02:37:34 -04:00
zpeng11	942caa888e	feat:supporting half export	2025-08-24 02:29:33 -04:00
zpeng11	72c5d3224e	utility updates	2025-08-24 02:11:47 -04:00
zpeng11	48d52778ce	feat:clean up export logics and add notes	2025-08-24 02:00:32 -04:00
zpeng11	e4d1894a8f	feat:experiments with for onnx with attention, but does not work well todo:clean code and try v3v4	2025-08-24 00:46:29 -04:00
zpeng11	5982080939	feat:updated fsdecode and decoder interface	2025-08-23 17:35:21 -04:00
zpeng11	b45cbc3561	feat: sampling params working now for export, todo:fold weights clean code	2025-08-23 13:03:02 -04:00
zpeng11	9ed42daa88	feat: allow fsdec and sdec to have sampling parames	2025-08-23 12:17:04 -04:00
zpeng11	3ccd1c0ea3	fix: solved t2s ending problem, and verified infer&forward has same output under deterministic random, fixed topk to 15	2025-08-23 03:31:01 -04:00
zpeng11	63cbb6efa7	verified audio, text, synthesizer all working, todo:dig into t2s for error	2025-08-22 01:36:58 -04:00
zpeng11	e8fdf472c0	feat:onnx friendly loop with same function	2025-08-21 22:23:50 -04:00
zpeng11	77794a5923	feat:export onnx with combined graph ready, todo:link weights in onnx graph	2025-08-21 01:52:34 -04:00
zpeng11	16d30ce1e4	feat:get ready for if node merge	2025-08-21 00:34:56 -04:00
zpeng11	403c5bf320	feat:v1v2 both works for export	2025-08-20 20:57:29 -04:00
zpeng11	bc7fe01876	feat:update v2pro's gpt path, todo:work on v1 transform	2025-08-20 20:41:20 -04:00
zpeng11	bb529e7e4a	update namings	2025-08-20 20:21:42 -04:00
zpeng11	4e0cc57052	update init_step name	2025-08-20 20:05:07 -04:00
zpeng11	aafa0561d8	correctly setup onnx export, solved problem	2025-08-20 19:39:10 -04:00
zpeng11	94b31a250f	limit vits for one input a time	2025-08-20 18:39:21 -04:00
zpeng11	fd0fb35a49	fix spectrum take out working	2025-08-20 18:32:38 -04:00
zpeng11	911c53b1ee	fixed using hubert for full run, 80 works	2025-08-20 17:37:41 -04:00
zpeng11	1cdd41d877	fixed resample audio and tested in full run	2025-08-20 16:47:55 -04:00
zpeng11	da5aa78224	feat:combined fsdc and encoder, todo:extract audio pipeline	2025-08-20 02:24:59 -04:00
zpeng11	71cbe28e68	feat:optimize looping	2025-08-19 21:31:42 -04:00
zpeng11	5c08328cf3	feat:voice and text preprocess system verifed, todo:dissasemble onnx export of gsv	2025-08-19 21:10:21 -04:00
zpeng11	dd156f15aa	feat:clean up playground explore audio preprocess, todo:build free run from pure input data	2025-08-19 01:22:00 -04:00
zpeng11	aef9d26580	feat:text_bert and audio_hubert exports are ready and fully tested, todo:solve dependancy in playground runs	2025-08-19 00:05:45 -04:00

1064 Commits