feat: add llm-katan to k8s deploy #466

JaredforReal · 2025-10-17T14:35:47Z

What this PR does / why we need it:

Add llm-katan to the Kubernetes deployment.
Separate the core and llm-katan modes, as we already do in the Docker Compose deployment.
Update the README and docs.
Expand the PVC size.
Fix the inference-pool selector error.

netlify · 2025-10-17T14:35:53Z

✅ Deploy Preview for vllm-semantic-router ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | `ee082b4` |
| 🔍 Latest deploy log | https://app.netlify.com/projects/vllm-semantic-router/deploys/68f5deff9dc72700088a047b |
| 😎 Deploy Preview | https://deploy-preview-466--vllm-semantic-router.netlify.app |


github-actions · 2025-10-17T14:44:00Z

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 `deploy`

Owners: @rootfs, @Xunzhuo
Files changed:

deploy/kubernetes/base/config.yaml
deploy/kubernetes/base/deployment.yaml
deploy/kubernetes/base/kustomization.yaml
deploy/kubernetes/base/pv.yaml
deploy/kubernetes/base/tools_db.json
deploy/kubernetes/overlays/core/kustomization.yaml
deploy/kubernetes/overlays/llm-katan/kustomization.yaml
deploy/kubernetes/overlays/llm-katan/patch-llm-katan.yaml
deploy/kubernetes/overlays/storage/kustomization.yaml
deploy/kubernetes/overlays/storage/namespace.yaml
deploy/kubernetes/README.md
deploy/kubernetes/ai-gateway/inference-pool/inference-pool.yaml
deploy/kubernetes/kustomization.yaml
deploy/kubernetes/base/namespace.yaml
deploy/kubernetes/base/service.yaml
deploy/kubernetes/overlays/storage/pvc.yaml

📁 `website`

Owners: @Xunzhuo, @rootfs, @yuluo-yx
Files changed:

website/docs/installation/kubernetes.md
website/docs/troubleshooting/network-tips.md

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

yossiovadia

Looks solid for moving forward. Consider adding validation for the Qwen model download, and possibly making the init container more robust in handling download failures. That can be a future enhancement.
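To make the suggestion concrete, here is a minimal sketch of what download validation plus retry could look like. It is not taken from this PR: the function names, the list of required files, and the retry parameters are all hypothetical, and the real init container may be a shell script rather than Python.

```python
import time
from pathlib import Path
from typing import Callable, Iterable, List


def missing_model_files(model_dir: Path, required: Iterable[str]) -> List[str]:
    """Return the required files that are absent or empty in model_dir."""
    missing = []
    for name in required:
        path = model_dir / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


def download_with_retry(
    download: Callable[[Path], None],  # hypothetical: fetches the model into the directory
    model_dir: Path,
    required: Iterable[str],           # assumption: the files that mark a usable download
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run download, validate the result, and retry with exponential backoff."""
    required = list(required)
    for attempt in range(attempts):
        try:
            download(model_dir)
        except Exception as exc:
            print(f"attempt {attempt + 1} failed: {exc}")
        else:
            missing = missing_model_files(model_dir, required)
            if not missing:
                return True
            print(f"attempt {attempt + 1} incomplete, missing: {missing}")
        if attempt < attempts - 1:
            # Wait 1x, 2x, 4x ... base_delay before the next attempt.
            sleep(base_delay * 2 ** attempt)
    return False
```

An init container built this way would exit non-zero when `download_with_retry` returns `False`, so the pod fails visibly instead of starting with a partial model directory.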

Signed-off-by: JaredforReal <[email protected]>

JaredforReal requested review from Xunzhuo and rootfs as code owners October 17, 2025 14:35

JaredforReal force-pushed the fix-k8s branch from 4faa35d to 320d9d9 Compare October 17, 2025 14:43

github-actions bot assigned rootfs and Xunzhuo Oct 17, 2025

JaredforReal marked this pull request as draft October 17, 2025 14:59

yossiovadia self-requested a review October 17, 2025 17:20

yossiovadia approved these changes Oct 17, 2025


JaredforReal added 8 commits October 20, 2025 15:04

expand pvc size & fix inference-pool selector error

67825a0

Signed-off-by: JaredforReal <[email protected]>

add llm-katan to k8s

b9d8cb3

Signed-off-by: JaredforReal <[email protected]>

separate core and llm-katan

3316c40

Signed-off-by: JaredforReal <[email protected]>

update k8s install docs

a7191cf

Signed-off-by: JaredforReal <[email protected]>

try fix CI error

a245d5c

Signed-off-by: JaredforReal <[email protected]>

add init models fall back

f9d1346

Signed-off-by: JaredforReal <[email protected]>

get rid of redundant files

3d11e5f

Signed-off-by: JaredforReal <[email protected]>

add pvc to k8s & update docs

ee082b4

Signed-off-by: JaredforReal <[email protected]>

JaredforReal force-pushed the fix-k8s branch from 320d9d9 to ee082b4 Compare October 20, 2025 07:04
