Thank you for the great work you've done on this model! Is there any way to batch the model using funasr? I've been trying to batch with padding and set the padding_mask to mask out the unused frames, but I'm not getting the same results as when I run inference sequentially.
Here's a sample of the code I'm using. I've tried a number of different configurations of arguments. There are several mask parameters, and it seems like `mask` refers to the MLM pretraining scheme, while `padding_mask` refers to the attention mask? I'm not sure, though, because there's no documentation. Any guidance would be appreciated.
```python
import torch
from torch.nn.utils.rnn import pad_sequence

from funasr import AutoModel
from funasr.utils.load_utils import load_audio_text_image_video

model = AutoModel(model="iic/emotion2vec_plus_large").model
model.eval()
model.to("cuda")

padding_value = -1
# audios is a list of audio tensors resampled to 16 kHz
x = load_audio_text_image_video(audios)
x = [torch.nn.functional.layer_norm(x_, x_.shape).squeeze() for x_ in x]
masked_x = pad_sequence(x, batch_first=True, padding_value=padding_value).to("cuda")
mask = masked_x == padding_value
out = model.extract_features(masked_x, mask=False, padding_mask=mask, remove_extra_tokens=True)
out_mask = out["padding_mask"]
feats = out["x"]
# Zero out padded frames, then mean-pool over the real frames only
feats[out_mask] = 0
print(feats.sum(dim=1) / (~out_mask).sum(dim=1).unsqueeze(-1))
```
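To be explicit about what I expect the final pooling to compute: with the padded frames zeroed out, dividing by the count of non-padded frames should give a plain per-utterance mean over real frames. A toy version with plain Python lists (scalar "frames" instead of feature vectors, just to pin down the arithmetic I intend):

```python
# Toy version of the masked mean-pool at the end of the snippet above.
# frames: one utterance's per-frame values; pad_mask: True where the frame is padding.
def masked_mean(frames, pad_mask):
    # Zero out padded frames, sum, then divide by the number of real frames,
    # mirroring feats.sum(dim=1) / (~out_mask).sum(dim=1) in the torch code.
    total = sum(0.0 if m else f for f, m in zip(frames, pad_mask))
    return total / sum(1 for m in pad_mask if not m)

print(masked_mean([2.0, 4.0, 0.0, 0.0], [False, False, True, True]))  # -> 3.0
```

So if batching were behaving as I expect, each row of the printed tensor in the snippet above would match the mean-pooled features from running that utterance alone.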