GPT-SoVITS

mirror of https://github.com/RVC-Boss/GPT-SoVITS.git synced 2026-06-04 05:01:27 +08:00

Author	SHA1	Message	Date
lsh	551d3dc281	Fix s1_train DDP crash on Windows single-GPU (sm_120 / Blackwell) On Windows with a single GPU running CUDA 12.8 + PyTorch 2.7+ on Blackwell (sm_120) hardware, s1_train.py crashes with an access violation (exit code 3221225477) shortly after pytorch_lightning's Trainer initialization, before the first batch runs. Root cause: DDPStrategy with the gloo backend is forced on Windows even when there's only one GPU. The gloo + sm_120 + CUDA 12.8 combination has a known incompatibility (see PyTorch forum "[Solved] RTX 5090 sm_120 Training Segfault - DDP Was the Cause") that produces a native crash inside the Lightning training loop. Two changes, scoped to Windows + CUDA only: * GPT_SoVITS/s1_train.py: on Windows, use Lightning's "auto" strategy, which picks `single_device` for one GPU and skips DDP entirely. Also pin devices=1 on Windows so multi-GPU users don't accidentally enable DDP. Non-Windows behaviour is unchanged (NCCL DDP, all available GPUs). * GPT_SoVITS/AR/data/bucket_sampler.py: when the distributed process group isn't initialized (i.e. running under single_device strategy), fall back to a single-replica configuration instead of crashing in dist.get_world_size(). Defensive change — behaviour is unchanged when DDP is properly initialized. Tested on: * Windows 11 + RTX 5090 (sm_120) + CUDA 12.8 + PyTorch 2.11+cu128 15-epoch s1 training completes cleanly, weights saved as expected. Closes #2626.	2026-05-16 19:10:20 -07:00
XXXXRT666	53cac93589	Refactor: Format Code with Ruff and Update Deprecated G2PW Link (#2255) * ruff check --fix * ruff format --line-length 120 --target-version py39 * Change the link for G2PW Model * update pytorch version and colab	2025-04-07 16:42:47 +08:00
RVC-Boss	e937b625e4	support sovits v3 lora training, 8G GPU memory is enough	2025-02-23 00:37:14 +08:00
RVC-Boss	fa42d26d0e	gpt_sovits_v3	2025-02-11 21:07:03 +08:00
huangxu1991	4f8e1660af	Add use_distributed_sampler=False in Trainer (#756) If you define your own sampler, you must set use_distributed_sampler to False.	2024-07-19 10:33:24 +08:00
RVC-Boss	a208698e77	Update s1_train.py	2024-06-29 22:54:05 +08:00
Lion	1963eb01cc	support cpu training, use cpu training on mac	2024-03-13 22:24:32 +08:00
RVC-Boss	e97cc3346a	Allow model experiment names to be set in Chinese. Fix https://github.com/RVC-Boss/GPT-SoVITS/issues/500	2024-02-17 16:45:31 +08:00
RVC-Boss	59f35adad8	Fix the GPT training hang and the "unmatched '}' in format string" error	2024-02-08 21:53:31 +08:00
RVC-Boss	f0cfe39708	fix gpt not save issue.	2024-01-28 19:34:03 +08:00
Wu Zichen	07a5339691	mps support	2024-01-24 19:37:47 +08:00
Wu Zichen	8069264e64	mps support	2024-01-24 17:30:49 +08:00
Blaise	0d92575115	Code refactor + remove unused imports	2024-01-16 17:10:27 +01:00
RVC-Boss	41ca6028d6	Add files via upload	2024-01-16 17:38:48 +08:00

14 Commits
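
The behaviour described in commit 551d3dc281 (Windows uses the "auto" strategy with a single device, other platforms keep NCCL DDP on all GPUs, and the bucket sampler falls back to one replica when no process group exists) can be sketched roughly as below. This is a minimal illustration, not the code from the commit: the function names `describe_trainer_config` and `resolve_replicas`, the returned dictionary layout, and the `fake_dist` stand-in are all hypothetical. The `dist` argument is assumed to expose the same four calls as `torch.distributed` (`is_available`, `is_initialized`, `get_world_size`, `get_rank`); a stand-in object is used so the sketch runs without PyTorch installed.

```python
from types import SimpleNamespace


def describe_trainer_config(platform_name, gpu_count):
    """Describe the strategy choice from the commit message (hypothetical helper).

    The returned dict is a plain description, not a ready-made set of
    Trainer keyword arguments.
    """
    if platform_name == "Windows":
        # "auto" lets Lightning pick a single-device strategy for one GPU,
        # so DDP (and the gloo backend) is never started.
        return {"strategy": "auto", "devices": 1, "backend": None}
    # Non-Windows behaviour is described as unchanged: NCCL DDP on all GPUs.
    return {"strategy": "ddp", "devices": gpu_count, "backend": "nccl"}


def resolve_replicas(dist):
    """Return (num_replicas, rank), falling back to (1, 0) without a process group."""
    if dist.is_available() and dist.is_initialized():
        return dist.get_world_size(), dist.get_rank()
    # Single-device run: behave like one replica with rank 0 instead of
    # calling get_world_size() on an uninitialized process group.
    return 1, 0


# Stand-in for torch.distributed when no process group was initialized
# (assumption: only these four calls are needed by the sampler).
fake_dist = SimpleNamespace(
    is_available=lambda: True,
    is_initialized=lambda: False,
    get_world_size=lambda: 4,
    get_rank=lambda: 2,
)

windows_cfg = describe_trainer_config("Windows", gpu_count=2)
linux_cfg = describe_trainer_config("Linux", gpu_count=2)
replicas = resolve_replicas(fake_dist)
```

In a real training script the same check would presumably be made against `torch.distributed` itself, and, per commit 4f8e1660af, the Trainer would also be created with `use_distributed_sampler=False` because the project supplies its own sampler.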