ONNX? #55
Comments
We did not provide an ONNX model. Contributions are welcome :)
I'm currently working to understand the model's inputs and outputs. Could you provide detailed information to help others add ONNX support? Specifically, I need the exact input and output details.

Update: this is the example I was able to run with this repo:

```python
"""
Using the emotion representation model.

rec_result only contains {'feats'}:
  granularity="utterance": {'feats': [*768]}
  granularity="frame":     {'feats': [T*768]}

Run with: python main.py
"""
import json
from collections import OrderedDict

from funasr import AutoModel

# Load the finetuned emotion recognition model
model = AutoModel(model="iic/emotion2vec_base_finetuned")

# Emotion labels in the order of the model's output scores
mapper = ["angry", "disgusted", "fearful", "happy", "neutral", "other", "sad", "surprised", "unknown"]

wav_file = "audio.wav"
rec_result = model.generate(wav_file, granularity="utterance")
scores = rec_result[0]['scores']

# Map each emotion to its probability
result = {emotion: float(prob) for emotion, prob in zip(mapper, scores)}

# Sort the result in descending order of probability
sorted_result = OrderedDict(sorted(result.items(), key=lambda item: item[1], reverse=True))
print(json.dumps(sorted_result, indent=4))
```

I didn't find any working example in the repo and had to play with it.
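Since the snippet above needs the model downloaded, the score-to-label mapping step can be checked in isolation with stand-in values (the toy probabilities below are made up, not real model output):

```python
import json
from collections import OrderedDict

# Labels in the order the model emits its scores
mapper = ["angry", "disgusted", "fearful", "happy", "neutral", "other", "sad", "surprised", "unknown"]

# Toy probabilities standing in for rec_result[0]['scores']
scores = [0.01, 0.02, 0.03, 0.70, 0.10, 0.05, 0.04, 0.03, 0.02]

# Pair each label with its probability, then sort descending
result = {emotion: float(prob) for emotion, prob in zip(mapper, scores)}
sorted_result = OrderedDict(sorted(result.items(), key=lambda item: item[1], reverse=True))

print(json.dumps(sorted_result, indent=4))
print(next(iter(sorted_result)))  # top label for these toy scores: happy
```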
I second this. It would be great to understand the details required to make an ONNX export. Much appreciated, @ddlBoJack, if you can help us out!
Thank you for contributing the ONNX model of emotion2vec.
Cool, I didn't know that the input can be a 16 kHz WAV directly. I tried to convert the
I've been working with the emotion2vec model and trying to convert it to ONNX format for deployment purposes. The current implementation is great for PyTorch users, but having ONNX support would enable broader deployment options.
I tried converting the model using torch.onnx.export with several approaches:
- Direct conversion of the AutoModel
- Creating a wrapper around the model components
- Implementing custom forward passes
Main challenges encountered:
- Dimension mismatches in the conv1d layers
- Issues with the masking mechanism
- Difficulty preserving the complete model architecture
- Problems with tensor handling between components
Could you please provide guidance on the correct architecture for ONNX conversion, including an example of proper tensor dimensionality through the model? I have converted torchvision models to ONNX before, but the audio models seemed a bit complicated to me :/
Thank you very much for your work, it works really nicely!
also see:
modelscope/FunASR#1690