-
Right now vLLM is a serving engine for a single model. You can start multiple vLLM server replicas and use a custom load balancer (e.g., an nginx load balancer). Also feel free to check out FastChat and other multi-model frontends (e.g., aviary); vLLM can be a model worker for these libraries to support multi-replica serving.
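To make the replica approach concrete, here is a minimal sketch of launching two replicas of the same model, one per GPU and port (the model name, GPU IDs, and ports are placeholder assumptions):

```python
import os
import subprocess

MODEL = "facebook/opt-125m"  # placeholder model name

# Launch one OpenAI-compatible vLLM server per GPU, each on its own port.
procs = []
for gpu_id, port in [(0, 8000), (1, 8001)]:
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
    procs.append(
        subprocess.Popen(
            [
                "python", "-m", "vllm.entrypoints.openai.api_server",
                "--model", MODEL,
                "--port", str(port),
            ],
            env=env,
        )
    )

# A load balancer (e.g., an nginx upstream, or a simple round-robin client)
# then spreads requests across http://localhost:8000 and http://localhost:8001.
for p in procs:
    p.wait()
```

Each replica holds its own copy of the weights, so this trades extra GPU memory for throughput.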
-
Is this still the case? If so, why does the API support a model parameter if the intent is not to host multiple models?
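For context, the `model` field is there because the server mirrors the OpenAI API schema; on a single-model vLLM server it is validated against the served model name rather than used to route between models. A small sketch, assuming a server started with `facebook/opt-125m` on port 8000:

```python
import requests

BASE_URL = "http://localhost:8000"  # assumed single-model vLLM server

# List the model name(s) the server will accept in the "model" field.
models = requests.get(f"{BASE_URL}/v1/models", timeout=10).json()
print([m["id"] for m in models["data"]])

# The "model" value must match the served name; other names are rejected.
resp = requests.post(
    f"{BASE_URL}/v1/completions",
    json={
        "model": "facebook/opt-125m",
        "prompt": "Hello, my name is",
        "max_tokens": 32,
    },
    timeout=60,
)
print(resp.json()["choices"][0]["text"])
```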
-
Based on the examples, vLLM can launch a server with a single model instance. Can vLLM serve clients using multiple model instances? With multiple model instances, the server would dispatch requests across the instances to reduce overhead.
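As a rough illustration of that dispatch pattern (and of the load-balancer answer above), here is a client-side round-robin sketch over two replicas; the URLs and model name are placeholder assumptions:

```python
import itertools

import requests

# Each entry is an independent vLLM OpenAI-compatible server replica.
REPLICAS = ["http://localhost:8000", "http://localhost:8001"]
_next_replica = itertools.cycle(REPLICAS)


def complete(prompt: str, model: str = "facebook/opt-125m") -> str:
    """Send a completion request to the next replica in round-robin order."""
    base_url = next(_next_replica)
    resp = requests.post(
        f"{base_url}/v1/completions",
        json={"model": model, "prompt": prompt, "max_tokens": 64},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["text"]


print(complete("San Francisco is"))
```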