Speaker voice is not consistent across different generation #112

swapsmagic · 2024-08-19T01:28:10Z

I tried the description "Laura speaks slightly faster than normal with slightly expressive monotone voice with a hint of excitement." with different input texts, and the resulting voice is drastically different each time. How can this be fixed? Is there a specific technique that helps keep the voice consistent?

jdola · 2024-08-19T12:32:24Z

That's right, this is a significant drawback.

Guppy16 · 2024-08-21T08:59:27Z

Please see my attempt at alleviating this issue in this PR: #110

The idea is to "prefix" the TTS with some audio so that it mimics that audio's prosody as it generates more audio.

(edit) I've also added a notebook in the PR to demonstrate how you could do this.
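
For readers who want a rough picture of the prefixing idea, here is a minimal sketch. It is not the code from the PR and not the library's API: `generate` is a hypothetical callable that takes a text and an audio prefix and returns a waveform beginning with that prefix, and `fake_generate` is a dummy stand-in so the snippet runs on its own. It also assumes the reference transcript is prepended to the target text so the prefix audio and the text line up.

```python
from typing import Callable, List, Sequence

# Hypothetical signature: (text, audio_prefix) -> waveform that starts with audio_prefix.
GenerateFn = Callable[[str, Sequence[float]], List[float]]


def generate_with_voice_prefix(
    generate: GenerateFn,
    ref_audio: Sequence[float],
    ref_text: str,
    target_text: str,
) -> List[float]:
    """Continue from a reference clip, then return only the newly generated audio."""
    # Assumption: the model sees the reference transcript followed by the new text.
    full_text = f"{ref_text.strip()} {target_text.strip()}"
    waveform = generate(full_text, ref_audio)
    # Drop the reference samples so only the continuation is returned.
    return list(waveform[len(ref_audio):])


def fake_generate(text: str, audio_prefix: Sequence[float]) -> List[float]:
    # Stand-in for a real model: keeps the prefix and appends one dummy sample per word.
    return list(audio_prefix) + [0.0] * len(text.split())


ref_clip = [0.1, 0.2, 0.3]
new_audio = generate_with_voice_prefix(
    fake_generate, ref_clip, "Hello there.", "How are you today?"
)
print(len(new_audio))
```

Reusing the same reference clip and transcript for every call is what would keep the voice stable across generations under this scheme; how well it works in practice depends on the model.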
