GPT-SoVITS

mirror of https://github.com/RVC-Boss/GPT-SoVITS.git synced 2026-06-05 22:08:15 +08:00

Author	SHA1	Message	Date
__kaning123__	24a03fb884	Merge 60414d25a39f3786a392523297734c144d1c59a9 into 445d18ccce0b4ea7cb6f8c93ff688b662bc61338	2026-04-18 17:17:01 +08:00
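huang yutong	445d18ccce	fix: fix multiple defects in TTS audio post-processing (#2753) 1. Fix integer overflow caused by a double int16 conversion during audio super-sampling (CRITICAL) - In audio_postprocess, `audio = (audio * 32768).astype(np.int16)` sits outside the if/else block and runs unconditionally; when super_sampling=True the audio has already been converted to int16 inside the branch, so multiplying by 32768 again overflows and completely distorts the audio - Also fix the AttributeError raised by calling .astype() on a torch.Tensor when super_sampling=True but the super-resolution model does not exist 2. Fix audio loss in batched vocoder inference when padding_len=0 (HIGH) - When padding_len is exactly 0, `-0 * upsample_rate == 0`, so the slice `audio[x:0]` returns an empty tensor and the entire audio segment is lost 3. Fix FileExistsError being raised incorrectly when a file does not exist (LOW) - It should be FileNotFoundError Made-with: Cursor	2026-04-18 17:16:24 +08:00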
Mushroomcowisheggs	00ce973412	feat: add error-handling hints for datasets (#2758) Co-authored-by: moomushroom <107208254+moomushroom@users.noreply.github.com>	2026-04-18 17:13:30 +08:00
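huang yutong	14191901cd	fix: fix independent bugs in several modules (#2755) 1. Fix sync_buffer dividing by a function object instead of its call result (distrib.py) - In `buffer.data /= world_size`, world_size is a function and the () is missing, so a TypeError makes buffer synchronization fail in distributed training 2. Fix the missing return statement in istft (spec_utils.py) - The function computed the result but never returned it, so callers always received None 3. Fix cut0 returning the literal "/n" instead of the newline "\n" (text_segmentation_method.py) - The later text.split("\n") could not split correctly, and the literal /n was treated as text content 4. Fix the vad/punc model_revision being unconditionally overwritten for Cantonese ASR (funasr_asr.py) - The Cantonese branch sets vad_model_revision to empty (it does not use the VAD/punctuation models), but an assignment outside the if/else overwrote it with "v2.0.4", passing a wrong revision parameter Made-with: Cursor	2026-04-18 17:10:56 +08:00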
东云	780383d5bd	[codex] Improve the Windows single-GPU v3 LoRA training workflow (#2767) * Improve Windows single-GPU v3 LoRA training * Drop unrelated checkpoint helper change from PR * Tighten PR scope to single-GPU training path fixes	2026-04-18 16:54:26 +08:00
白菜工厂1145号员工	ba8de9b760	Optimize G2PW inference input construction and polyphonic-character handling to reduce redundant computation and lower inference overhead for long sentences (#2763) * Enhance G2P processing by implementing batch input handling in _g2p function, improving efficiency. Update prepare_onnx_input to utilize caching for tokenization and add optional parameters for character ID mapping and phoneme masks. Refactor G2PWOnnxConverter to streamline model loading and configuration management. * Enhance G2PW model input handling by introducing polyphonic context character support and updating the data preparation method to return additional query IDs. This improves the processing of polyphonic characters in sentences.	2026-04-18 16:52:32 +08:00
__kaning123__	60414d25a3	Merge pull request #2 from kaning123/Dev Dev	2026-04-06 13:32:49 +08:00
Kaning123	e6a67650ff	feat: add export of intermediate values	2026-04-06 13:01:32 +08:00
Kaning123	24d7290c11	feat: Added VoiceChange.py	2026-04-06 12:59:31 +08:00
Kaning123	fb50fc090f	feat:Added batch tts option	2026-04-06 12:58:00 +08:00
Kaning123	cb2b844f45	feat: Added ReturnWay option to get_tts_wav	2026-04-04 14:17:07 +08:00
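Kaning123	5c03499fcf	feat: add find_func to the VoiceSave module	2026-04-02 17:26:08 +08:00
Kaning123	46ae12bf17	feat: add an entry to shut down the TTS WebUI and entries to save intermediate values such as ge, for distribution and use	2026-04-02 17:24:19 +08:00
Kaning123	47170fd555	feat: add the ability to append tensors to a tensor-group file	2026-03-29 11:10:28 +08:00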
Kaning123	f3a9603eb0	style: move new entries to the middle of the page	2026-03-21 13:19:48 +08:00
Kaning123	5450922d8d	feat:Added entry to get value "ge" of class SynthesizerTrn	2026-03-19 17:39:55 +08:00
Kaning123	86ac5555e1	feat: Added webUI entries	2026-03-14 15:28:50 +08:00
Kaning123	e49d396b18	fix: add inst.bat and inst2.ps1 to work around the "The script failed due to call depth overflow." error that can occur when running install.ps1	2026-03-14 13:28:46 +08:00
Kaning123	eedb06b303	fix:Fixed config.json loader in config.py	2026-03-14 13:01:11 +08:00
Kaning123	6e3db0126c	fix: Fixed conda-go-webui.bat	2026-03-14 12:59:09 +08:00
Kaning123	0e83383544	feat:added bat file for launching webui with conda	2026-03-14 09:32:11 +08:00
Kaning123	99a2e356f2	feat:remove “-q“ option of conda installation	2026-03-13 21:35:24 +08:00
__kaning123__	53b17bd2d2	Merge pull request #1 from kaning123/Dev Added ability to use preprocessed data to speed up tts efficiency	2026-02-25 14:01:46 +08:00
__kaning123__	69f1c9c2dd	feat: Added path check	2026-02-25 13:56:47 +08:00
__kaning123__	012eb93ef8	feat: add a variable controlling whether reference audio is enabled	2026-02-25 10:37:33 +08:00
__kaning123__	f6e8ec8a78	feat:Added .voice loader	2026-02-25 10:20:48 +08:00
__kaning123__	1c54a945cb	feat: Added entries to save sv_emb and refers	2026-02-25 07:53:03 +08:00
__kaning123__	a6a53f7231	feat: Added entry to disable checks	2026-02-24 07:48:12 +08:00
__kaning123__	a06011d838	fix:fix import errors	2026-02-23 14:29:40 +08:00
__kaning123__	6ef7c0b70f	feat: Add lib allows tensor saving	2026-02-23 09:51:55 +08:00
XXXXRT666	2d9193b0d3	Migrate to miniforge, add missing dependencies, update docker file, remove deprecated files (#2732) * Migrate to miniforge, add missing dependencies, update docker file, remove deprecated files * Add Env Vars and Secrets	2026-02-09 15:05:25 +08:00
Oarora	9986880b3f	fix build failure caused by unaccepted Conda terms of service (#2727)	2026-02-08 23:52:04 +08:00
ChasonJiang	c767f0b83b	Fix bug (#2704) * Fix bug * fallback and bug fix	2025-12-30 16:00:21 +08:00
ChasonJiang	9080a967d5	Fix sampling error (#2703)	2025-12-30 15:21:03 +08:00
sushistack	51df9f7384	Fix model file name in README instructions (#2700)	2025-12-25 16:44:21 +08:00
ChasonJiang	bfca0f6b2d	Align the decoding strategy with naive_infer to prevent dropped sentences (#2697)	2025-12-19 17:37:19 +08:00
ChasonJiang	abe984395c	Align the default GPT topk sampling parameters (#2696)	2025-12-19 16:05:36 +08:00
RVC-Boss	cc89c3660e	Update requirements.txt	2025-12-19 15:54:54 +08:00
ChasonJiang	36b3231c6f	bug fix (#2689)	2025-12-15 14:23:06 +08:00
RVC-Boss	9ec3a60f30	Update config.py	2025-12-01 20:23:49 +08:00
RVC-Boss	fc533b6fb7	Update fasterwhisper_asr.py	2025-12-01 11:38:37 +08:00
XXXXRT666	857799276c	Fix Modelscope (#2679)	2025-12-01 11:13:15 +08:00
Spr_Aachen	92d2d337fd	Fix training error caused by float type of default_batch_size parameter (#2662)	2025-11-28 22:53:43 +08:00
ChasonJiang	6fb441f65e	More user-friendly streaming mode options (#2678)	2025-11-28 22:13:48 +08:00
XXXXRT666	c85c54eca9	Add ModelScope Snapshot Download For ASR (#2627) * Add ModelScope Snapshot Download For ASR * Typo Fix * Remove YUE in whisper * Remove HF ENDPOINT * Add FunASR Download	2025-11-28 22:10:49 +08:00
RVC-Boss	cb00840c4e	Add files via upload	2025-11-28 22:02:03 +08:00
wzy3650	60a4a214af	vq distributed training support (#2577) Co-authored-by: wangzeyuan <wangzeyuan@agora.io>	2025-11-28 21:57:13 +08:00
zzz	6375bbe316	Try stream infer (#2469) * Try stream infer * Plot the generated audio in the stream_infer script * stream_infer: add an export section * stream_infer: plots that make patterns easier to spot * stream_infer: run a correlation search when concatenating audio to reduce fundamental-frequency breaks caused by splicing * stream_infer: export `find_best_audio_offset_fast` * stream_infer: improve the waveform display for easier comparison * stream_v2pro.py: read arguments from the command line * stream_v2pro.py: shorten the text used for export * stream_v2pro: fix spec overflow and the resulting silent output caused by half-precision input to spectrogram_torch * stream_v2pro: add a --lang argument to indicate the language of the reference text	2025-11-28 21:36:57 +08:00
KamioRinn	e00ca92140	Fix ASMD (#2636)	2025-11-28 21:22:43 +08:00
ChasonJiang	92ab59c553	Finer-grained streaming inference mode (#2671) * Better streaming inference mode * Remove unused code * modified: GPT_SoVITS/AR/models/t2s_model.py modified: GPT_SoVITS/TTS_infer_pack/TTS.py modified: GPT_SoVITS/module/models.py * modified: GPT_SoVITS/TTS_infer_pack/TTS.py * modified: .gitignore modified: GPT_SoVITS/AR/models/t2s_model.py modified: GPT_SoVITS/TTS_infer_pack/TTS.py modified: GPT_SoVITS/module/models.py * modified: GPT_SoVITS/AR/models/t2s_model.py modified: GPT_SoVITS/TTS_infer_pack/TTS.py modified: GPT_SoVITS/module/models.py modified: api_v2.py * modified: GPT_SoVITS/TTS_infer_pack/TTS.py * Fix spelling errors * Support streaming inference with a fixed chunk length and optimize the SOLA algorithm * Fix the ogg format transmission issue in api_v2	2025-11-28 21:12:41 +08:00
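
Commit 445d18ccce above describes two pitfalls that are easy to reproduce in isolation: a slice ending at `-0` is a slice ending at `0` and therefore empty, and scaling audio by 32768 a second time after it is already int16 overflows. The sketch below illustrates both with NumPy arrays. It is not the repository's code: `trim_padding` and `to_int16` are hypothetical helper names, and the actual fix operates on the project's own tensors inside `audio_postprocess` and the batched vocoder path.

```python
import numpy as np


def trim_padding(audio: np.ndarray, padding_len: int, upsample_rate: int) -> np.ndarray:
    """Drop padding_len * upsample_rate trailing samples (hypothetical helper)."""
    trim = padding_len * upsample_rate
    # audio[:-0] is the same as audio[:0], which is empty,
    # so zero padding must be handled as a separate case.
    return audio[:-trim] if trim > 0 else audio


def to_int16(audio: np.ndarray) -> np.ndarray:
    """Convert float audio in [-1, 1] to int16 at most once (hypothetical helper)."""
    if audio.dtype == np.int16:
        # Already converted; multiplying by 32768 again would overflow.
        return audio
    scaled = np.clip(audio * 32768.0, -32768, 32767)
    return scaled.astype(np.int16)


if __name__ == "__main__":
    audio = np.linspace(-1.0, 1.0, 8, dtype=np.float32)
    print(audio[:-0].size)                   # naive slice with zero padding
    print(trim_padding(audio, 0, 256).size)  # guarded slice keeps every sample
    print(to_int16(to_int16(audio)))         # second call leaves the data unchanged
```

Checking the dtype before scaling makes the conversion idempotent, which is one way to avoid depending on which branch ran earlier; moving the conversion inside the non-super-sampling branch, as the commit message suggests, is an equally valid choice.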

1062 Commits