Add ONNX export for Pytorch Model #93
Hi HuguesGallier,

The STFT or iSTFT operations can be performed externally (you need to remove the STFT and iSTFT computations inside the spectral gating code), or you can implement the STFT operation as an nn.Module using conv1d and a precomputed Fourier basis and integrate it with the spectral gate (see this issue: pytorch/pytorch#31317). Since it is scheduled to be supported in the next opset, we think it is unnecessary to add it to noisereduce.

By the way, if you are only running the spectral gating on 2 seconds of audio, that may not be enough, as it expects both noise and speech to be in the same recording. I suggest that you capture the noise profile externally and pass it via the y_noise argument. We may add the ability to continuously stream noise statistics to the streamer function in the future. I hope this is helpful!
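For concreteness, a minimal sketch of that conv1d approach, assuming a Hann window with win_length == n_fft (the ConvSTFT module below is an illustration, not noisereduce's actual code):

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvSTFT(nn.Module):
    """STFT as conv1d over a fixed, precomputed Fourier basis (nothing is learned)."""
    def __init__(self, n_fft=512, hop_length=128):
        super().__init__()
        self.hop_length = hop_length
        window = torch.hann_window(n_fft)                      # win_length == n_fft here
        k = torch.arange(n_fft // 2 + 1).unsqueeze(1).float()  # (freq_bins, 1)
        n = torch.arange(n_fft).unsqueeze(0).float()           # (1, n_fft)
        angle = 2.0 * math.pi * k * n / n_fft
        # X[k] = sum_n x[n] * w[n] * (cos(angle) - j*sin(angle))
        real_kernel = (torch.cos(angle) * window).unsqueeze(1)   # (freq_bins, 1, n_fft)
        imag_kernel = (-torch.sin(angle) * window).unsqueeze(1)
        # Buffers, not Parameters: exported as constants and never updated by training.
        self.register_buffer("real_kernel", real_kernel)
        self.register_buffer("imag_kernel", imag_kernel)

    def forward(self, x):                       # x: (batch, samples)
        x = x.unsqueeze(1)                      # (batch, 1, samples)
        real = F.conv1d(x, self.real_kernel, stride=self.hop_length)
        imag = F.conv1d(x, self.imag_kernel, stride=self.hop_length)
        return real, imag                       # each (batch, freq_bins, frames)
```

Note that this sketch omits the center padding torch.stft applies by default, so frame counts will differ slightly from torch.stft on the same input.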
Feel free to use this repo to export your custom STFT or ISTFT process to ONNX format. There's no need to separate the STFT and ISTFT from the model anymore.
I am trying to export the QuartzNet model from NeMo. During preprocessing of the audio signal, torch.stft() is used [line 437 in the source code], which causes errors when exporting the preprocessor class to ONNX, so I am looking for a replacement for the torch.stft() function. When I pass a dummy audio signal as input to the preprocessor, torch.stft() outputs a tensor of shape torch.Size([1, 257, 61]), while your package outputs a real part and an imaginary part, each of shape torch.Size([257, 61]). The NeMo code then passes the torch.stft() output to torch.view_as_real(), and I'm not sure how to pass the real and imaginary parts output by your package to this function.

Another question: since your package uses conv1d layers, I assume these layers get updated during model training. If these layers are initialized randomly, wouldn't that affect the outputs? I'm using your package only for inference.
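For reference, a minimal sketch of one way to recombine the two parts so they match torch.view_as_real's layout (shapes taken from the comment above; assumes the package returns the parts without a batch dimension):

```python
import torch

# Hypothetical stand-ins for the package's outputs, each (257, 61).
real = torch.randn(257, 61)
imag = torch.randn(257, 61)

# torch.view_as_real(torch.stft(...)) yields (1, 257, 61, 2): real/imag
# interleaved on a trailing axis. Stacking reproduces that layout without
# complex tensors, which ONNX export cannot represent.
stacked = torch.stack((real, imag), dim=-1).unsqueeze(0)  # (1, 257, 61, 2)
```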
You can use the custom STFT/ISTFT implementation from that repo. In random tests, this custom STFT/ISTFT shows almost no difference compared to the native torch.stft()/torch.istft().
Thanks. Do I need to train it? I'm using a pretrained model for inference only.
No, you don't need to train it.
Hi @DakeQQ, I successfully exported the preprocessor to ONNX by replacing torch.stft() with your custom implementation. However, when I load it in onnxruntime-web and try to run inference, I get the following error -
I tried loading the same ONNX model in Python like so -
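A minimal sketch of this kind of check with onnxruntime in Python (the file name, dummy input shape, and output handling are hypothetical placeholders, not the original snippet):

```python
import numpy as np
import onnxruntime as ort

# "preprocessor.onnx" and the dummy input shape are hypothetical placeholders.
sess = ort.InferenceSession("preprocessor.onnx")
input_name = sess.get_inputs()[0].name
dummy = np.random.randn(1, 16000).astype(np.float32)  # ~1 s of 16 kHz audio, assumed
outputs = sess.run(None, {input_name: dummy})
print([o.shape for o in outputs])
```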
Here I get the output as expected -

Why is the inference working in Python, but not on the web?
@kabyanil
Hello,

I have been using the PyTorch model on a Raspberry Pi. I am running it on 2 seconds of audio to detect a wake word every 200 ms (see this issue regarding why I am not running it on the independent 200 ms chunks). It takes between 10 ms and 40 ms to run.

The performance is still good, but could be improved with ONNX. I have therefore tried to export the model to ONNX (code below) and got several errors (also below).
This is not an issue with this repository. Simply, PyTorch does not support exporting either istft or stft to ONNX. See this issue that tracks it. Nonetheless, on our end, we could maybe use the ONNX STFT operator directly. For istft, it seems that they very recently started thinking about adding it (see this issue), but it is still not here. What are your thoughts on this?

Note: once a model is exported to ONNX, its parameters cannot be changed as far as I know. So a great thing to do here would probably be to allow the export via a to_onnx method on an instantiated TorchGate object. If we find a solution for this istft and stft issue, I'd be willing to make a PR for it :)

Note 2: From here and here, it seems that we might just have to wait until torch.onnx supports opset 19, which should contain the missing operators... Not sure though.

Annex:
Code to export to ONNX (to be put here):
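A minimal sketch, under assumptions, of the kind of export call that triggers the error (the TorchGate import path and constructor arguments follow the noisereduce README; the sample rate, input length, and opset version are placeholders):

```python
import torch
from noisereduce.torchgate import TorchGate

# Assumed instantiation (per the noisereduce README); values are placeholders.
tg = TorchGate(sr=16000, nonstationary=True)
tg.eval()

# 2 seconds of 16 kHz audio, matching the wake-word use case described above.
dummy = torch.randn(1, 32000)

# This call fails while tracing torch.stft/torch.istft, since torch.onnx
# has no ONNX mapping for those ops.
torch.onnx.export(
    tg,
    dummy,
    "torchgate.onnx",
    input_names=["audio"],
    output_names=["enhanced"],
    opset_version=17,
    dynamic_axes={"audio": {1: "num_samples"}},
)
```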
Error: