Speaker voice is not consistent across different generation #112

Open
swapsmagic opened this issue Aug 19, 2024 · 2 comments
Comments

swapsmagic commented Aug 19, 2024

I tried the description "Laura speaks slightly faster than normal with slightly expressive monotone voice with a hint of excitement." with different texts, and the voice differs drastically from one generation to the next. How can this be fixed? Is there a specific technique that helps keep the voice consistent?

jdola commented Aug 19, 2024

That's right, this is a major drawback.

Guppy16 commented Aug 21, 2024

Please see my attempt at alleviating this issue in this PR: #110

The idea is to "prefix" the TTS with some audio so that it mimics that audio's prosody as it generates more audio.

(edit) I've also added a notebook in the PR to demonstrate how you could do this.
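
For readers who want the shape of the prefixing idea without opening the PR, here is a minimal sketch. It is not the code from the PR and does not use the project's real API: `ContinueFn`, `speak_with_prefix`, and `fake_continue` are hypothetical names, and audio is represented as a plain list of integer frames purely for illustration. The assumption is that the model can continue from audio it is told it has already produced, so the reference clip and its transcript are placed in front of the new text and then trimmed from the result.

```python
from typing import Callable, List, Sequence

# Hypothetical interface (NOT the project's real API): a function that takes
# the full prompt text plus already-known audio frames and returns those
# frames followed by newly generated ones.
ContinueFn = Callable[[str, Sequence[int]], List[int]]


def speak_with_prefix(
    continue_audio: ContinueFn,
    ref_text: str,
    ref_frames: Sequence[int],
    new_text: str,
) -> List[int]:
    """Generate audio for new_text, conditioned on a reference clip."""
    # The prompt covers both the reference transcript and the new text, so
    # the reference frames line up with the start of the prompt.
    prompt = f"{ref_text.strip()} {new_text.strip()}"
    frames = continue_audio(prompt, ref_frames)
    # Drop the reference frames so only the newly generated speech remains.
    return frames[len(ref_frames):]


def fake_continue(prompt: str, prefix: Sequence[int]) -> List[int]:
    """Stand-in model for illustration: echoes the prefix, then emits one
    dummy frame per word of the prompt."""
    return list(prefix) + [0] * len(prompt.split())


ref_frames = [7, 7, 7]  # placeholder for an encoded reference clip
out = speak_with_prefix(fake_continue, "Hello there.", ref_frames, "How are you?")
print(len(out))
```

Reusing the same reference clip and transcript for every call is what would keep the voice stable across generations, assuming the real model behaves like the continuation function sketched here.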
