Batching model inference #37

Open
Vedaad-Shakib opened this issue Jul 13, 2024 · 2 comments
@Vedaad-Shakib

Hi,

Thank you for the great work you've done on this model! Is there any way to run batched inference through funasr? I've been trying to batch with padding, setting padding_mask to mask out the unused frames, but I'm not getting the same results as when I run inference sequentially.

Here's a sample of the code I'm using. I've tried a number of different argument configurations: there are several mask parameters, and it seems like mask refers to the MLM-style pretraining masking, while padding_mask is the attention mask? I'm not sure, though, because there's no documentation. Any guidance would be appreciated.

import torch
from torch.nn.utils.rnn import pad_sequence

from funasr import AutoModel
from funasr.utils.load_utils import load_audio_text_image_video

model = AutoModel(model="iic/emotion2vec_plus_large").model
model.eval()
model.to("cuda")

padding_value = -1

# audios is a list of audio tensors resampled to 16 kHz
x = load_audio_text_image_video(audios)
# Per-utterance layer norm, matching the single-input pipeline
x = [torch.nn.functional.layer_norm(x_, x_.shape).squeeze() for x_ in x]
# Pad to a common length and mark which positions are padding
masked_x = pad_sequence(x, batch_first=True, padding_value=padding_value)
mask = masked_x == padding_value

# Move inputs to the same device as the model
masked_x = masked_x.to("cuda")
mask = mask.to("cuda")

out = model.extract_features(masked_x, mask=False, padding_mask=mask, remove_extra_tokens=True)
out_mask = out["padding_mask"]
feats = out["x"]

# Zero out padded frames, then mean-pool over the valid frames only
feats[out_mask] = 0
print(feats.sum(dim=1) / (~out_mask).sum(dim=1).unsqueeze(-1))
@jiahaolu97

I'm running into the same problem. Looking forward to guidance on batched input processing.

@ddlBoJack
Owner

Hi, currently the model does not support batched input processing.
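
In the meantime, one workaround is to loop over utterances with the high-level AutoModel API, which matches sequential results by construction. A minimal sketch, assuming audios is a list of 16 kHz wav file paths (or 1-D float tensors) and the generate() usage shown on the emotion2vec_plus model card:

from funasr import AutoModel

model = AutoModel(model="iic/emotion2vec_plus_large")

# audios: list of 16 kHz wav paths or 1-D float tensors
feats = []
for wav in audios:
    # granularity="utterance" yields one embedding per input;
    # extract_embedding=True returns it under res[0]["feats"]
    res = model.generate(wav, granularity="utterance", extract_embedding=True)
    feats.append(res[0]["feats"])

This trades throughput for correctness: each utterance runs at batch size 1, so no padding or masking is involved.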
