Fixed some errors to make it easier for first-time users. #72

Open: MingHui-Fang wants to merge 2 commits into main



@MingHui-Fang MingHui-Fang commented Jun 14, 2024

I ran into some issues when using the official code, and I've seen other people report the same problems, so they may be worth addressing.

Because of GPU limitations, many users do not use the 600M model and instead rely on helpers/model_init_scripts/init_dummy_model_with_encodec.py or helpers/model_init_scripts/init_dummy_model.py.

However, the configurations in these two files have some potential problems for first-time users (such as #63 (comment) and #66 (comment)), and it would probably be better to fix them and add comments.

In addition, since users may change the model architecture and run into errors as a result, it may be worth considering refinements to training/run_parler_tts_training.py.

The above content is merely a personal suggestion. Please forgive any shortcomings.

@MingHui-Fang
Author

There is a bug in how the code handles the variables rat and lens. As mentioned in #73 (comment), if the batch size is 1, squeeze() turns rat and lens into 0-d tensors, which zip() cannot iterate over. Replacing squeeze() with reshape(-1) keeps them 1-d and solves the problem simply.

if accelerator.is_main_process:
    lab = generate_labels["labels"].cpu().transpose(1, 2).to(torch.int16)
    rat = generate_labels["ratio"].cpu().squeeze()
    lens = generate_labels["len_audio"].cpu().squeeze()
    lab = [l[:, : int(ratio * length)] for (l, ratio, length) in zip(lab, rat, lens)]
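
For reference, here is a minimal sketch of the difference between squeeze() and reshape(-1) when the batch size is 1. The helper name trim_labels and the tensor shapes (labels as batch x frames x codebooks, ratio and len_audio as batch x 1) are assumptions made for illustration only; they are not taken from the training script.

```python
import torch


def trim_labels(labels, ratio, len_audio):
    """Hypothetical helper mirroring the snippet above, with reshape(-1)."""
    lab = labels.cpu().transpose(1, 2).to(torch.int16)
    # reshape(-1) always returns a 1-d tensor, even for a single element
    rat = ratio.cpu().reshape(-1)
    lens = len_audio.cpu().reshape(-1)
    return [l[:, : int(r * n)] for (l, r, n) in zip(lab, rat, lens)]


# Assumed shapes for a batch of size 1
labels = torch.zeros(1, 10, 4, dtype=torch.long)
ratio = torch.tensor([[0.5]])
len_audio = torch.tensor([[10]])

squeezed = ratio.squeeze()      # drops every size-1 dim, leaving a 0-d tensor
flattened = ratio.reshape(-1)   # stays 1-d with one element

# zip() needs to iterate over its arguments, which a 0-d tensor does not allow
try:
    list(zip(squeezed))
    squeeze_failed = False
except TypeError:
    squeeze_failed = True

trimmed = trim_labels(labels, ratio, len_audio)
print(squeezed.dim(), flattened.dim(), squeeze_failed, trimmed[0].shape)
```

With larger batches squeeze() and reshape(-1) give the same 1-d result for tensors of these assumed shapes, so the change only affects the single-sample case.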
