Subtle jitter in generated speech? #636
ragequitninja started this conversation in General
When training a new TTS model on the LJSpeech dataset at high quality (22.05 kHz), I notice that some parts of sentences have a very slight jitter at inference time. It is subtle, but annoying because it sounds unnatural. I am already past epoch 2500.
I understand that increasing the parameter count will make the model larger and slower on CPU. But before spending more GPU resources, I wonder whether anyone has already tried training a larger model (e.g. by increasing n_heads and/or n_layers). If so, did it make the speech sound more natural overall?
Or am I thinking about this the wrong way?
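To size up that trade-off before committing GPU time, here is a rough sketch that estimates the parameter count of a transformer-style text encoder. It is only an illustration under stated assumptions: a standard multi-head attention layout with a two-convolution feed-forward block, ignoring any relative-position embeddings. The function name `encoder_params` and the values for `hidden_channels`, `filter_channels` and `kernel_size` are hypothetical, not taken from this project's config.

```python
def encoder_params(n_layers, n_heads, hidden_channels, filter_channels, kernel_size=1):
    """Rough parameter estimate for a transformer-style encoder (assumed layout)."""
    if hidden_channels % n_heads != 0:
        raise ValueError("hidden_channels must be divisible by n_heads")
    # Attention: Q, K, V and output projections, each hidden x hidden plus a bias.
    # The heads split hidden_channels between them, so n_heads does not change this.
    attn = 4 * (hidden_channels * hidden_channels + hidden_channels)
    # Feed-forward: two 1-D convolutions, hidden -> filter -> hidden, with biases.
    ffn = (hidden_channels * filter_channels * kernel_size + filter_channels) \
        + (filter_channels * hidden_channels * kernel_size + hidden_channels)
    # Two layer norms per layer, each with a gain and a bias vector.
    norms = 2 * 2 * hidden_channels
    return n_layers * (attn + ffn + norms)


# Hypothetical sizes, for illustration only.
base = encoder_params(n_layers=6, n_heads=2, hidden_channels=192,
                      filter_channels=768, kernel_size=3)
more_heads = encoder_params(n_layers=6, n_heads=4, hidden_channels=192,
                            filter_channels=768, kernel_size=3)
more_layers = encoder_params(n_layers=12, n_heads=2, hidden_channels=192,
                             filter_channels=768, kernel_size=3)
print(base, more_heads, more_layers)
```

Under these assumptions, raising n_heads alone leaves the parameter count unchanged, because the heads only partition the same hidden width. Raising n_layers scales the count linearly. So of the two, n_layers (or the hidden width) is the knob that actually makes the encoder larger and slower.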