The performance of MaskGCT is not meeting expectations. #348

Open
MonolithFoundation opened this issue Nov 15, 2024 · 3 comments

@MonolithFoundation

output.wav.zip

What is going on with the audio quality of this output?


    target_text = """大家好,我是雷军。我是小米科技的创始人,一个有梦想的年轻人。我在小米科技的前几年,一直是一个做硬件的人,现在我想做一个软件的人。我想做一个让每个人都能用上的手机,这是我的梦想。
现在,我想让每个人都开上汽车。
"""
    # inference
    infer(
        prompt_wav_path="data/leijun-prompt.wav",
        prompt_text="我第四次办年度演讲,前三次呢,前三次呢因为疫情的原因,都在小米科技园内举办。现场的人很少,这是第四次,我们仔细想了想,我们还是想办一个比较大的,",
        target_text=target_text,
        source_lang="zh",
        target_lang="zh",
        save_path="output/output.wav",
    )

`infer` is the code from the ipynb notebook; I did not change anything.

@yuantuo666
Collaborator

Hi, please provide the prompt WAV file so we can check better.

BTW, we prefer English issues: #304 (comment).

@yuantuo666 changed the title from "MaskGCT效果不及预期" to "The performance of MaskGCT is not meeting expectations." on Nov 18, 2024
@MonolithFoundation
Author

Thank you!

I switched to a different WAV prompt, and the result seems normal now. I still have two questions:

  1. I want to control the length of the output audio. Is that possible? If the length I give is too short, will generation fail?
  2. If the prompt audio contains background noise (such as background music), how will it affect the final result, and is there any way to fix it?

@yuantuo666
Collaborator

  1. Yes. You can specify the target_len parameter, which is given in seconds. A usable range is roughly 0.8x-1.2x of the normal duration. If the duration is too short, the model might drop some words or fail to generate.
  2. Check out this: [Help]: Regarding the MaskGCT input params and the HF demo #305 (comment)
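
For the first point, here is a minimal sketch of keeping a requested length inside the suggested 0.8x-1.2x window before passing it as target_len. The helper name `clamp_target_len` and the idea of measuring the "normal" duration from an earlier run without target_len are assumptions for illustration, not part of MaskGCT. Whether the notebook's `infer` wrapper forwards target_len is also an assumption, so check its signature first.

```python
def clamp_target_len(requested_s: float, normal_s: float,
                     low: float = 0.8, high: float = 1.2) -> float:
    """Clamp a requested duration (seconds) to [low, high] x the normal duration.

    Hypothetical helper: `normal_s` is the duration you observed when
    generating the same text without specifying target_len.
    """
    if normal_s <= 0:
        raise ValueError("normal_s must be positive")
    return min(max(requested_s, low * normal_s), high * normal_s)


# Example: an unconstrained run produced about 10 s of audio,
# and we would like roughly 7 s instead.
target_len = clamp_target_len(requested_s=7.0, normal_s=10.0)
print(f"target_len = {target_len:.1f} s")

# Assumption: the notebook's infer() accepts and forwards target_len.
# infer(..., target_len=target_len)
```

Here the 7 s request is raised to the lower bound of the window, since anything shorter risks dropped words or a failed generation.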
