
Conversation

@googs1025 (Contributor)

  • docs: add P/D disaggregation example in manifests/disaggregation

fix: #251 (comment)

```yaml
metadata:
  name: vllm-sim-p
  labels:
    app: my-llm-pool
```
Collaborator

Why is the "my-llm-pool" label required? When the simulator is installed as part of llm-d, llm-d defines the inference pool, but here this is a stand-alone example.

Contributor Author

Removed; only the llm-d.ai/role label is used now.
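
For reference, a minimal sketch of what the updated metadata could look like after this change; the `prefill` role value is an assumption based on the `vllm-sim-p` deployment name and is not shown in this thread:

```yaml
# Hedged sketch: only the llm-d.ai/role label is kept.
# The "prefill" value is assumed from the deployment name vllm-sim-p.
metadata:
  name: vllm-sim-p
  labels:
    llm-d.ai/role: prefill
```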

Now, send a request to the forwarded Decode service port with the necessary headers:

```bash
curl -v http://localhost:8001/v1/chat/completions \
```

Collaborator

The port should be 8000.

Contributor Author

Oh, thanks for the reminder.
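
For reference, a hedged sketch of the corrected request against the forwarded port 8000. The model name is a placeholder, and any P/D-specific headers used in the actual example are elided here:

```bash
# Sketch only: OpenAI-compatible chat completions request to the forwarded
# decode port 8000. "my-model" is a placeholder model name.
curl -v http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "my-model",
        "messages": [{"role": "user", "content": "Hello"}]
      }'
```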

```yaml
spec:
  containers:
  - name: routing-sidecar
    image: ghcr.io/llm-d/llm-d-routing-sidecar:latest
```

Collaborator

There is no `latest` tag for the sidecar.

Contributor Author (@googs1025, Nov 10, 2025)

Yes, that's also something I had questions about. When I was testing, I used ghcr.io/llm-d/llm-d-routing-sidecar:v0.3.1-rc.1, which I took from the releases page: https://github.com/llm-d/llm-d-routing-sidecar/releases.

However, I see "latest" being used here, which I assumed I simply couldn't download because of network issues on my side:

https://github.com/llm-d/llm-d-inference-scheduler/blob/091f07e7809160bc062be2199500f2675e9cf734/deploy/components/vllm-sim-pd/deployments.yaml#L74
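
For reference, a sketch of the same container with the image pinned to the release the author tested with instead of `latest`:

```yaml
# Sketch: pin the sidecar image to a released tag instead of "latest".
# v0.3.1-rc.1 is the version mentioned above; check the releases page for
# the current one.
spec:
  containers:
  - name: routing-sidecar
    image: ghcr.io/llm-d/llm-d-routing-sidecar:v0.3.1-rc.1
```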

@googs1025 force-pushed the feature/pd_example branch 2 times, most recently from a1137a6 to 842d281, on November 10, 2025 at 01:28.
@mayabar (Collaborator) commented Nov 10, 2025

@googs1025 I tested the configuration from the scheduler, and it turned out that there is a new version of the nixl connector; the simulator should be changed to stay in sync with it.
I created an issue for this: #255

@googs1025 (Contributor Author)

> @googs1025 I tested the configuration from the scheduler, and it turned out that there is a new version of the nixl connector; the simulator should be changed to stay in sync with it. I created an issue for this: #255

Changed to `- "--connector=nixlv2"`.
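
For reference, a sketch of how the connector change could look in the sidecar container args; the image tag and surrounding fields are assumptions carried over from the earlier discussion, and note the follow-up below that this change alone is not enough:

```yaml
# Sketch: sidecar args updated to use the nixlv2 connector.
# Other args are omitted; the image tag follows the earlier discussion.
containers:
- name: routing-sidecar
  image: ghcr.io/llm-d/llm-d-routing-sidecar:v0.3.1-rc.1
  args:
  - "--connector=nixlv2"
```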

@mayabar (Collaborator) commented Nov 11, 2025

Changing the connector type to nixlv2 is not enough; P/D will work end-to-end after the #255 fix.


Successfully merging this pull request may close this issue: Does vLLM-Simulator support Prefill/Decode separation simulation?
