English | [中文](README-ZH.md)
SynthereTTS combines several techniques behind a simple user interface to provide more realistic speech synthesis and voice cloning.
SynthereTTS has the following features:
- Optional basic and advanced models: The basic model uses a pre-trained VITS model, and the advanced version is built on top of the GPT-SoVITS model.
- Configurable emphasis: Wrap any word or phrase in [] to mark it for emphasis.
- Noise suppression: The noise suppression function adaptively suppresses noise in the generated audio and improves its quality.
- Cloning and cross-language speech synthesis: In addition to training a model on the speaker's voice, you can directly supply reference audio from a speaker who is not in the pre-trained model and synthesize speech with the timbre of that reference audio.
- Easy to use: The front-end interface is rewritten in PyQt, which is efficient and eliminates complicated parameter settings.
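To illustrate the [] emphasis markup described above, here is a minimal sketch of how bracketed phrases could be extracted from input text. The function name `parse_emphasis` and its return format are assumptions made for this example; they are not taken from the SynthereTTS code base, which may handle the markup differently.

```python
import re

# Hypothetical helper for illustration only; not part of SynthereTTS.
# Matches a non-empty, non-nested phrase wrapped in square brackets.
_EMPHASIS = re.compile(r"\[([^\[\]]+)\]")


def parse_emphasis(text):
    """Strip [] markers and report where the emphasized phrases are.

    Returns (plain_text, spans), where each span is a (start, end)
    pair of character offsets into plain_text.
    """
    parts = []
    spans = []
    pos = 0       # read position in the marked-up input
    out_len = 0   # length of the plain text built so far
    for match in _EMPHASIS.finditer(text):
        before = text[pos:match.start()]
        parts.append(before)
        out_len += len(before)

        phrase = match.group(1)
        spans.append((out_len, out_len + len(phrase)))
        parts.append(phrase)
        out_len += len(phrase)

        pos = match.end()
    parts.append(text[pos:])
    return "".join(parts), spans


plain, spans = parse_emphasis("I [really] like [this voice].")
for start, end in spans:
    print(plain[start:end])
```

Returning offsets rather than the phrases themselves keeps the plain text intact for synthesis while still letting a later stage locate the emphasized words.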
Runs on CPU or GPU. The advanced version is slower on CPU. Running on GPU requires at least 8 GB of video memory; for CPU, 16 GB or more of memory is recommended.
Language | Status |
---|---|
English (en) | ✅ |
Chinese (zh) | ✅ |
- The VITS model used by the basic version can be downloaded from the link provided by sherpa-onnx. The pre-trained model for the advanced version can be downloaded from the link provided in GPT_SoVITS, or you can follow the prompts to train your own model.
- As text length increases, the generation time of the advanced model increases significantly. Longer text can be synthesized by modifying the constraints, but splitting the text into segments is still recommended.
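The recommendation to split long text into segments can be sketched as follows. This is a naive example, not the splitting logic used by SynthereTTS: the function name `split_text` and the default limit of 100 characters are assumptions, and the sentence detection relies only on common English and Chinese end-of-sentence punctuation, so it will also split at periods inside numbers or abbreviations.

```python
import re

# Hypothetical helper for illustration only; not part of SynthereTTS.
# A "piece" is a run of text followed by its end-of-sentence punctuation.
_SENTENCE = re.compile(r"[^.!?。！？]+[.!?。！？]*")


def split_text(text, max_chars=100):
    """Pack whole sentences into segments of at most max_chars characters.

    A single sentence longer than max_chars is kept as its own segment
    rather than being cut in the middle. max_chars=100 is an arbitrary
    default chosen for this sketch.
    """
    segments = []
    current = ""
    for piece in _SENTENCE.findall(text):
        candidate = (current + piece).strip()
        if current.strip() and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit: close the segment.
            segments.append(current.strip())
            current = piece
        else:
            current += piece
    if current.strip():
        segments.append(current.strip())
    return segments


for segment in split_text("One. Two. Three.", max_chars=10):
    print(segment)
```

Each segment can then be synthesized separately and the resulting audio concatenated, which keeps the generation time of every individual call bounded.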
- Add intonation control
- Add emphasis and noise suppression to the basic model
- Add more emphasis levels and provide hot-word mapping for different emphasis levels
- More ...
- Controllable Emphasis with zero data for text-to-speech: insights on emphasis control
- sherpa-onnx: efficient and easy-to-use VITS models
- GPT_SoVITS: an excellent Chinese speech synthesis model