[Feature]: 【TTS】The source audio has no interference, the generated audio is blurry, and the pronunciation of individual characters is correct #343

Open
DecodeW opened this issue Nov 11, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@DecodeW

DecodeW commented Nov 11, 2024

Is your feature request related to a problem? Please describe.

When I generate audio with TTS, some words come out correctly, but others are blurry or do not even sound like human speech. Only about 2-3 out of every 10 words are correct.

Describe the solution you'd like

1. Generate normal audio without strange voices.

Describe alternatives you've considered

1. I edited the punctuation (dot) format to match the language and spacing. It helped a little, but I don't think this is the main problem.
2. I changed target_len to different values, including None, with no useful effect.

Additional context

audio-issue

DecodeW added the enhancement label on Nov 11, 2024
@yuantuo666
Collaborator

Hi, MaskGCT is trained on Emilia, whose training clips are 3-30 seconds long. Your prompt text seems a bit long and is missing punctuation. You could try:

  1. Cut the prompt_wav down to 3-10 seconds. (Make sure the total length of the prompt wav and the generated wav stays within 30 seconds.)
  2. Add punctuation to prompt_text.
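
The first suggestion above could be sketched roughly as follows. This is a minimal illustration using only Python's standard `wave` module, not code from the MaskGCT repository: the helper names `trim_wav` and `max_target_seconds` are hypothetical, and it assumes the prompt is an uncompressed PCM WAV file.

```python
import wave

MAX_TOTAL_SECONDS = 30.0  # upper bound on prompt + generated audio quoted above


def trim_wav(src_path, dst_path, max_seconds=10.0):
    """Copy at most the first `max_seconds` of a PCM WAV file to `dst_path`.

    Returns the duration (in seconds) of the audio that was kept.
    Hypothetical helper for illustration; not part of MaskGCT.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames_to_keep = min(src.getnframes(),
                             int(max_seconds * src.getframerate()))
        frames = src.readframes(frames_to_keep)
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)    # same channels, sample width, and rate
        dst.writeframes(frames)  # header frame count is corrected on close
    return frames_to_keep / params.framerate


def max_target_seconds(prompt_seconds, total_budget=MAX_TOTAL_SECONDS):
    """Seconds left for the generated audio once the prompt is accounted for."""
    return max(0.0, total_budget - prompt_seconds)
```

A hard cut like this can land in the middle of a word, so prompt_text would presumably need to be shortened to match whatever speech remains in the trimmed clip.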

@yuantuo666
Collaborator

Related issue: #305

@DecodeW
Author

DecodeW commented Nov 27, 2024

I have tried the methods you suggested. The result is the same as before, with only a slight improvement.
