
v0.3.0 roadmap #698

Open
3 tasks
Jeffwan opened this issue Feb 18, 2025 · 8 comments
Labels
kind/enhancement New feature or request kind/feature Categorizes issue or PR as related to a new feature. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now.
Milestone

Comments

@Jeffwan
Collaborator

Jeffwan commented Feb 18, 2025

🚀 Feature Description and Motivation

I created this issue to track the v0.3.0 items we'd like to work on. We do have a milestone, https://github.com/aibrix/aibrix/milestone/9, that tracks all issues, but it contains so many that users who don't work on this project might feel overwhelmed.

Let's create a list of the items users are interested in.

Use Case

Track the v0.3.0 release items

Proposed Solution

No response

@Jeffwan Jeffwan pinned this issue Feb 18, 2025
@Jeffwan Jeffwan added kind/enhancement New feature or request priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. kind/feature Categorizes issue or PR as related to a new feature. labels Feb 18, 2025
@kerthcet
Collaborator

I'm wondering whether we can deliver a stable version at some point. Stable here means a workable state: fewer bugs, relatively complete documentation, and good test coverage. We could make that a baseline and append more features on top of it, with feature gates or flags to enable/disable them. I ask only because I've seen a lot of inspiring features waiting to be merged, and I have no idea what the long-term plan is for evolving them.

@Jeffwan
Collaborator Author

Jeffwan commented Feb 18, 2025

@kerthcet I agree that after v0.2.0 we will have a solid baseline of features, and ensuring production-grade quality should be our top priority. We can discuss this further and align on the next steps, as you suggested. The future roadmap should balance new feature development with production readiness, maintaining stability while continuing to evolve. We have some internal adoptions as well; we will try to surface the bugs and tricks from those at the same time.

@gaocegege
Collaborator

Are you planning to support [Feature]: Support Ray-free multi-node distributed inference on resource managers like Kubernetes to simplify the deployment of multi-node inference? I recently discussed this with youkaichao@, and he believes it could be possible by implementing a new executor.

Some references: vllm-project/vllm#11400

@kerthcet
Collaborator

Based on an earlier offline talk with @Jeffwan, I think AIBrix leverages Ray for fine-grained orchestration, such as multi-host serving and P/D disaggregated serving, so maybe this isn't in the long-term plan? That needs @Jeffwan's confirmation. But it's definitely possible for the vLLM project.

@robertgshaw2-redhat

Congrats on the launch guys!

@Electronic-Waste

@Jeffwan Awesome! Congrats on the open-source!

@Jeffwan
Collaborator Author

Jeffwan commented Feb 24, 2025

@gaocegege @kerthcet

We do see that many users dislike Ray for distributed serving because of its overhead and poor debuggability. Supporting a cloud-native way to run vLLM across multiple nodes would be beneficial; I think users should be given options. We created vllm-project/vllm#3902 earlier but didn't get a chance to work on it. If people like the idea and no one is working on it yet, we will put in some effort and also make the corresponding changes to the orchestration layer.

BTW, orchestration for the P&D (prefill/decode disaggregation) case will introduce an application router or a local cluster scheduler (CLS in the Splitwise paper), which is not exactly the same as the current multi-node approach. If that paradigm can be finalized, the cloud-native way sounds like a plan. If not, I think it is still a potential problem, because every time the paradigm changes, the cloud-native path needs additional changes.
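For readers unfamiliar with the CLS concept from the Splitwise paper: a disaggregated setup routes each request through two pools, a prefill pool (compute-bound) and a decode pool (memory-bound). The sketch below illustrates only that two-phase routing decision, with simple round-robin placement; the class and field names are hypothetical, not AIBrix's API, and a real system would also transfer KV-cache state between the chosen nodes.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Optional


@dataclass
class Request:
    req_id: str
    prefill_node: Optional[str] = None
    decode_node: Optional[str] = None


class PDRouter:
    """Round-robin two-phase router over separate prefill/decode pools."""

    def __init__(self, prefill_nodes, decode_nodes):
        self._prefill = cycle(prefill_nodes)
        self._decode = cycle(decode_nodes)

    def schedule(self, req: Request) -> Request:
        # Phase 1: pick a prefill node to compute the prompt's KV cache.
        req.prefill_node = next(self._prefill)
        # Phase 2: pick a decode node; in a real system the KV cache
        # would be migrated from prefill_node to decode_node here.
        req.decode_node = next(self._decode)
        return req


router = PDRouter(["p0", "p1"], ["d0", "d1", "d2"])
r = router.schedule(Request("req-1"))
print(r.prefill_node, r.decode_node)  # p0 d0
```

The design point being debated above is exactly where this routing decision lives: in an application-level router, or in a cluster scheduler managed the cloud-native way.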

@Jeffwan Jeffwan added this to the v0.3.0 milestone Feb 24, 2025
@gaocegege
Collaborator

> BTW, orchestration for the P&D (prefill/decode disaggregation) case will introduce an application router or a local cluster scheduler (CLS in the Splitwise paper), which is not exactly the same as the current multi-node approach. If that paradigm can be finalized, the cloud-native way sounds like a plan. If not, I think it is still a potential problem, because every time the paradigm changes, the cloud-native path needs additional changes.

We had some discussions about this in production-stack too: vllm-project/production-stack#7 (comment).

/cc @KuntaiDu


5 participants