
DataCollatorMusicGenWithPadding might have a bug #4

Open
LiuZH-19 opened this issue May 5, 2024 · 1 comment

Comments


LiuZH-19 commented May 5, 2024

Thank you very much for your amazing work!
While running melody-conditioned generation, I encountered the following error in the `DataCollatorMusicGenWithPadding` class:

```python
batch[self.feature_extractor_input_name : input_values]
```

```
TypeError: unhashable type: 'slice'
```

`input_values` here is actually a dictionary, so the line above builds a `slice` object and tries to use it as a dict key. I resolved the error by changing the code to `batch.update(input_values)`. Could you kindly confirm whether this approach is correct?

```python
def __call__(
    self, features: List[Dict[str, Union[List[int], torch.Tensor]]]
) -> Dict[str, torch.Tensor]:
    # Split inputs and labels, since they have different lengths and need
    # different padding methods.
    labels = [
        torch.tensor(feature["labels"]).transpose(0, 1) for feature in features
    ]
    # (bsz, seq_len, num_codebooks)
    labels = torch.nn.utils.rnn.pad_sequence(
        labels, batch_first=True, padding_value=-100
    )

    input_ids = [{"input_ids": feature["input_ids"]} for feature in features]
    input_ids = self.processor.tokenizer.pad(input_ids, return_tensors="pt")

    batch = {"labels": labels, **input_ids}

    if self.feature_extractor_input_name in features[0]:
        input_values = [
            {
                self.feature_extractor_input_name: feature[
                    self.feature_extractor_input_name
                ]
            }
            for feature in features
        ]
        input_values = self.processor.feature_extractor.pad(
            input_values, return_tensors="pt"
        )
        batch.update(input_values)
        # Original (buggy) line:
        # batch[self.feature_extractor_input_name : input_values]

    return batch
```
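To illustrate why the original line raises this error: inside square brackets, `a : b` builds a `slice(a, b)` object, and dict keys must be hashable, which slices are not. A minimal standalone sketch (using a plain dict to stand in for the padded feature dict, not the actual MusicGen objects):

```python
batch = {"labels": [0, 1]}
key = "input_values"
# feature_extractor.pad(...) returns a dict-like object; model it as a plain dict here
input_values = {"input_values": [[0.1, 0.2]]}

try:
    # `batch[key : input_values]` is parsed as batch[slice(key, input_values)],
    # and dict lookup with an unhashable slice raises TypeError
    batch[key : input_values]
except TypeError as e:
    print(e)  # unhashable type: 'slice'

# The fix: merge the padded features into the batch
batch.update(input_values)
print("input_values" in batch)  # True
```

This also suggests the original line was likely intended as a plain assignment, `batch[key] = input_values[key]`, which `batch.update(input_values)` achieves for every key at once.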
@hieuhthh

I'm hitting the same issue; can you fix it? Also, how can we run inference with audio and a text prompt together? Thank you!
