Hi,
Thank you for your excellent work on this project!
I wanted to share that I have implemented vLLM support for Fun-ASR inference, which currently achieves roughly a 50% speedup. I created a repository for this implementation here:
https://github.com/yuekaizhang/Fun-ASR-vllm
Benchmark Details
Dataset: SPEECHIO_ASR_ZH00007 (approx. 1 hour of audio)
Hardware: Single NVIDIA H20 GPU
| Mode | Decoding Time | RTF ↓ | RTFx ↑ | CER | Note |
|---|---|---|---|---|---|
| Huggingface PyTorch | 218.2 s | 0.06 | 16.5 | 7.02% | batch_size=1 |
| vLLM (Qwen3-0.6B) | 145.6 s | 0.04 | 24.7 | 6.99% | batch_size=1 |
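For readers unfamiliar with the metrics, the RTF and RTFx columns follow directly from the decoding times, assuming the ~1 hour (≈3600 s) test-set duration stated above. A minimal sketch:

```python
# Sanity-check the RTF / RTFx figures in the table above.
# Assumption: SPEECHIO_ASR_ZH00007 is ~1 hour, i.e. about 3600 s of audio.

def rtf(decode_seconds: float, audio_seconds: float) -> float:
    """Real-time factor: decoding time / audio duration (lower is better)."""
    return decode_seconds / audio_seconds

def rtfx(decode_seconds: float, audio_seconds: float) -> float:
    """Inverse real-time factor: audio duration / decoding time (higher is better)."""
    return audio_seconds / decode_seconds

AUDIO_SECONDS = 3600.0  # assumed test-set duration

print(round(rtf(218.2, AUDIO_SECONDS), 2))   # Huggingface PyTorch
print(round(rtfx(218.2, AUDIO_SECONDS), 1))
print(round(rtf(145.6, AUDIO_SECONDS), 2))   # vLLM (Qwen3-0.6B)
print(round(rtfx(145.6, AUDIO_SECONDS), 1))

# Relative speedup of vLLM over the PyTorch baseline (~50%):
print(218.2 / 145.6 - 1)
```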
I plan to continue optimizing the performance in the future. Any feedback or attention would be greatly appreciated!