X1aoZEOuO
Contributor

@X1aoZEOuO X1aoZEOuO commented Sep 28, 2025

What this PR does / why we need it

This PR adds groundwork for serverless operation and model activation. It introduces new constants for tracking model activation state and cache information.

Environment variables such as POD_IP were added so that networking settings can be configured at deploy time. The main function now accepts flags for enabling the serverless feature and for setting the pod IP, and passes these settings to the controllers.
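As a rough illustration of the flag and environment handling described above, here is a minimal sketch using only the standard library. The flag names `--enable-serverless` and `--pod-ip`, the `Config` type, and the `parseConfig` helper are assumptions made for this example and may not match the names used in the PR; only `POD_IP` comes from the description.

```go
package main

import (
	"flag"
	"fmt"
	"net"
	"os"
)

// Config holds the settings discussed above. Field and flag names are
// illustrative assumptions, not the names used in this PR.
type Config struct {
	EnableServerless bool
	PodIP            string
}

// parseConfig reads flags from args and falls back to the POD_IP
// environment variable (looked up via getenv) when --pod-ip is not given.
func parseConfig(args []string, getenv func(string) string) (Config, error) {
	fs := flag.NewFlagSet("manager", flag.ContinueOnError)
	var cfg Config
	fs.BoolVar(&cfg.EnableServerless, "enable-serverless", false,
		"enable serverless activation (hypothetical flag)")
	fs.StringVar(&cfg.PodIP, "pod-ip", "",
		"IP of this pod; defaults to $POD_IP (hypothetical flag)")
	if err := fs.Parse(args); err != nil {
		return Config{}, err
	}
	if cfg.PodIP == "" {
		cfg.PodIP = getenv("POD_IP")
	}
	// In this sketch the pod IP is only required when serverless mode is on.
	if cfg.EnableServerless && net.ParseIP(cfg.PodIP) == nil {
		return Config{}, fmt.Errorf("serverless mode needs a valid pod IP, got %q", cfg.PodIP)
	}
	return cfg, nil
}

func main() {
	cfg, err := parseConfig(os.Args[1:], os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("serverless=%v podIP=%q\n", cfg.EnableServerless, cfg.PodIP)
}
```

Passing `getenv` as a parameter keeps the parsing logic testable without touching the process environment. In a cluster, `POD_IP` is commonly populated from the Downward API (`status.podIP`), though whether this PR does so is not stated here.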

RBAC rules were expanded to cover additional resource operations, including patching and updating endpoints. A new controller, ActivatorReconciler, was implemented to handle model activation, service reconciliation, and traffic forwarding, all of which serverless activation depends on.

Lastly, the service creation logic was updated to add model annotations, so that services carry the information needed for activation. Together, these changes support dynamic, scalable deployments.
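A minimal sketch of attaching model annotations during service creation might look like the following. The annotation keys and the `withModelAnnotations` helper are hypothetical, since the description does not list the actual keys; a real implementation would set these on the Service object's metadata.

```go
package main

import "fmt"

// Hypothetical annotation keys; the PR description does not name the real ones.
const (
	annotationModelName      = "example.llmaz.io/model-name"
	annotationModelNamespace = "example.llmaz.io/model-namespace"
)

// withModelAnnotations returns a copy of existing with the model's identity
// added, leaving the input map unmodified.
func withModelAnnotations(existing map[string]string, name, namespace string) map[string]string {
	out := make(map[string]string, len(existing)+2)
	for k, v := range existing {
		out[k] = v
	}
	out[annotationModelName] = name
	out[annotationModelNamespace] = namespace
	return out
}

func main() {
	annotations := withModelAnnotations(map[string]string{"team": "ml"}, "demo-model", "default")
	fmt.Println(annotations[annotationModelName], annotations[annotationModelNamespace])
}
```

Copying the map rather than mutating it avoids surprising callers that reuse the same annotation set across several services.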

Which issue(s) this PR fixes

Fixes #362

Special notes for your reviewer

Does this PR introduce a user-facing change?


cc @pacoxu @kerthcet

@InftyAI-Agent added the needs-triage, needs-priority, and do-not-merge/needs-kind labels on Sep 28, 2025
@X1aoZEOuO force-pushed the feat/0-1-activator branch 2 times, most recently from 5424dec to a7ae00f, on September 28, 2025 12:19
@X1aoZEOuO
Contributor Author

/kind feature

@InftyAI-Agent added the feature label and removed the do-not-merge/needs-kind label on Sep 28, 2025
@X1aoZEOuO force-pushed the feat/0-1-activator branch 2 times, most recently from 5d23e51 to 0a1d0fe, on September 28, 2025 15:40
Labels
feature, needs-priority, needs-triage
Development

Successfully merging this pull request may close these issues.

[OSPP] KEDA-based Serverless Elastic Scaling for llmaz
2 participants