Batched inference on multiple audios #1177
cc: @MahmoudAshraf97 maybe - would be great to get another set of eyes on the eval scripts.
Unfortunately it doesn't yet. The batching we support is on a single file that is segmented using VAD, and AFAIK the Open ASR leaderboard discourages the usage of VAD.
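To make the distinction concrete, here is a minimal, hypothetical sketch of the single-file batching style described above: one audio file is first cut into speech segments by VAD, and those segments (not separate files) are then grouped into batches for the model. The segment values and the `batch_vad_segments` helper are illustrative assumptions, not faster-whisper's actual API.

```python
def batch_vad_segments(segments, batch_size):
    """Group VAD speech segments from ONE audio file into batches.

    segments: list of (start_sec, end_sec) speech spans, as a VAD
    front-end might produce for a single recording.
    """
    return [segments[i:i + batch_size]
            for i in range(0, len(segments), batch_size)]


# Hypothetical VAD output for a single file: five speech spans.
segments = [(0.0, 4.2), (5.1, 9.8), (11.0, 14.3), (15.0, 19.9), (21.2, 25.0)]

# With batch_size=2 the five segments form three batches, all drawn
# from the same file - no cross-file batching happens here.
batches = batch_vad_segments(segments, batch_size=2)
print(len(batches))  # 3
```

This is why the RTFx comparison is skewed: the leaderboard scripts batch across many files, while this scheme only parallelizes within one file.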
Hi @MahmoudAshraf97 - thanks for the response. Do you have sample code for how this would look with transformers? Or can you point us to the right documentation? 🙏
You'll have to replace this with the logic from faster-whisper/faster_whisper/transcribe.py, lines 223 to 237 (commit 8327d8c).
Just some quick tips:
Hello,
We are currently looking to add faster-whisper into the Open ASR Leaderboard.
Here is the script we are using to run the evals: https://github.com/huggingface/open_asr_leaderboard/tree/main/ctranslate2
We noticed that it does not natively support batched inference on multiple audios, which makes the RTFx significantly lower than the original Whisper results, which are evaluated with a batch size of 64.
Is there something we are doing wrong?
Thanks
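For context, the cross-file batching the leaderboard relies on can be sketched roughly as follows: several variable-length audio clips are padded to a common length and stacked into one array so a model can run a single forward pass over all of them. This is an illustrative NumPy sketch under stated assumptions; the `pad_and_stack` helper is hypothetical and not part of faster-whisper or the leaderboard scripts.

```python
import numpy as np


def pad_and_stack(audios, pad_value=0.0):
    """Pad variable-length 1-D audio arrays to a common length and
    stack them into one (batch, time) array. Returns the batch plus
    the original lengths, so padded frames can be masked out later."""
    lengths = [len(a) for a in audios]
    max_len = max(lengths)
    batch = np.full((len(audios), max_len), pad_value, dtype=np.float32)
    for i, audio in enumerate(audios):
        batch[i, : len(audio)] = audio
    return batch, lengths


# Three hypothetical clips of different lengths (sample counts at 16 kHz).
clips = [np.ones(16000), np.ones(8000), np.ones(24000)]
batch, lengths = pad_and_stack(clips)
print(batch.shape, lengths)  # (3, 24000) [16000, 8000, 24000]
```

Batching this way amortizes the model's per-step cost over many utterances, which is why a batch size of 64 gives the original Whisper evaluations a much higher RTFx than per-file processing.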