X1aoZEOuO
Contributor

What this PR does / why we need it

This PR adds a guide for configuring serverless environments on Kubernetes, using Prometheus for monitoring and KEDA for autoscaling. The guide includes YAML configurations, step-by-step installation instructions, and performance benchmarks. It aims to improve resource efficiency through event-driven scaling while maintaining observability and resilience for AI/ML workloads and other latency-sensitive applications.

Which issue(s) this PR fixes

Fixes #362

Special notes for your reviewer

Does this PR introduce a user-facing change?


cc @pacoxu @kerthcet

@InftyAI-Agent added the needs-triage, needs-priority, and do-not-merge/needs-kind labels on Sep 28, 2025
@X1aoZEOuO
Contributor Author

/kind feature

@InftyAI-Agent added the feature label and removed the do-not-merge/needs-kind label on Sep 28, 2025
@X1aoZEOuO
Contributor Author

/kind documentation

@InftyAI-Agent added the documentation label on Sep 28, 2025
Labels
documentation: Categorizes issue or PR as related to documentation.
feature: Categorizes issue or PR as related to a new feature.
needs-priority: Indicates a PR lacks a label and requires one.
needs-triage: Indicates an issue or PR lacks a label and requires one.
Development

Successfully merging this pull request may close these issues.

[OSPP] KEDA-based Serverless Elastic Scaling for llmaz
2 participants