A non-autoregressive, end-to-end text-to-speech (text-to-wav) system that supports a family of SOTA unsupervised duration modeling methods. This project grows with the research community, aiming to achieve the ultimate E2E-TTS.
text-to-speech
deep-learning
unsupervised
end-to-end
pytorch
tts
speech-synthesis
jets
multi-speaker
sota
single-speaker
neural-tts
non-autoregressive
fastspeech2
hifi-gan
non-ar
ultimate-tts
text-to-wav
Updated Jun 6, 2022 - Python