docs: add P/D disaggregation example in manifests/disaggregation #253
base: main
Force-pushed from 6ce81a9 to c9364a8.
metadata:
  name: vllm-sim-p
  labels:
    app: my-llm-pool
Why is the "my-llm-pool" label required? When the simulator is installed as part of llm-d, it defines the inference pool, but this is a stand-alone example.
Removed; the example now uses only the llm-d.ai/role label.
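For illustration, a metadata block that carries only the role label might look like the sketch below. The value `prefill` is an assumption inferred from the `-p` suffix of `vllm-sim-p`; it is not quoted from the PR, and the final manifest may differ.

```yaml
# Sketch only: the label value is assumed, not taken from the PR.
metadata:
  name: vllm-sim-p
  labels:
    llm-d.ai/role: prefill  # assumed value for the prefill simulator
```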
manifests/disaggregation/README.md
Outdated
Now, send a request to the forwarded Decode service port with the necessary headers:

```bash
curl -v http://localhost:8001/v1/chat/completions \
```
The port should be 8000.
Oh, thanks for the reminder.
spec:
  containers:
  - name: routing-sidecar
    image: ghcr.io/llm-d/llm-d-routing-sidecar:latest
There is no "latest" tag for the sidecar image.
Yes, I had questions about that as well. When I was testing, I used ghcr.io/llm-d/llm-d-routing-sidecar:v0.3.1-rc.1, which I took from https://github.com/llm-d/llm-d-routing-sidecar/releases.
However, I saw "latest" being used here, and I assumed that my failure to download it was due to network issues.
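A pinned image reference using the release-candidate tag mentioned above could look roughly like the fragment below. This is a sketch; the thread does not state which tag the PR finally adopts.

```yaml
# Sketch only: pins the sidecar to the tag used during testing,
# since no "latest" tag is published for this image.
spec:
  containers:
    - name: routing-sidecar
      image: ghcr.io/llm-d/llm-d-routing-sidecar:v0.3.1-rc.1
```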
Force-pushed from a1137a6 to 842d281.
@googs1025 I tested the configuration from the scheduler, and it turned out that there is a new version of the nixl connector; the simulator should be changed to stay in sync with it.
Signed-off-by: googs1025 <[email protected]>
Force-pushed from 842d281 to 3c49142.
change to
Changing the connector type to nixlv2 is not enough; P/D will work end to end only after the #255 fix.
fix: #251 (comment)