Hi,
Thank you for your excellent work on this project!
I wanted to share that I have implemented vLLM support for Fun-ASR inference, which currently achieves roughly a 50% speedup. I created a repository for this implementation here:
https://github.com/yuekaizhang/Fun-ASR-vllm
Benchmark Details
Dataset: SPEECHIO_ASR_ZH00007 (approx. 1 hour of audio)
Hardware: Single NVIDIA H20 GPU
| Mode | Decoding Time | RTF ↓ | RTFx ↑ | CER | Note |
|---|---|---|---|---|---|
| Huggingface PyTorch | 218.2 s | 0.06 | 16.5 | 7.02% | batch_size=1 |
| vLLM (Qwen3-0.6B) | 145.6 s | 0.04 | 24.7 | 6.99% | batch_size=1 |
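For readers unfamiliar with the metrics, the RTF and RTFx columns follow directly from the decoding times, assuming the ~1 hour (≈3600 s) test-set duration stated above. A minimal sketch:

```python
# Sanity-check the RTF / RTFx figures in the table above.
# Assumption: SPEECHIO_ASR_ZH00007 is ~1 hour, i.e. about 3600 s of audio.

def rtf(decode_seconds: float, audio_seconds: float) -> float:
    """Real-time factor: decoding time / audio duration (lower is better)."""
    return decode_seconds / audio_seconds

def rtfx(decode_seconds: float, audio_seconds: float) -> float:
    """Inverse real-time factor: audio duration / decoding time (higher is better)."""
    return audio_seconds / decode_seconds

AUDIO_SECONDS = 3600.0  # assumed test-set duration

print(round(rtf(218.2, AUDIO_SECONDS), 2))   # Huggingface PyTorch
print(round(rtfx(218.2, AUDIO_SECONDS), 1))
print(round(rtf(145.6, AUDIO_SECONDS), 2))   # vLLM (Qwen3-0.6B)
print(round(rtfx(145.6, AUDIO_SECONDS), 1))

# Relative speedup of vLLM over the PyTorch baseline (~50%):
print(218.2 / 145.6 - 1)
```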
I plan to continue optimizing the performance in the future. Any feedback or attention would be greatly appreciated!