Allow formula based dynamic maxReplicas #5938
Hello,
Yes. Though unfortunately
Interesting point!
Also, AFAIK changing the max value is kind of a stronger rule than a utilization-based metric. Usually when we detect some kind of saturation in the system, we do not really care about scale up/down policies, stabilization windows, etc. The autoscaler shall just drop some replicas if they are over the new limit.
Reaching maxReplicaCount is often treated as an alarm-worthy event. Dynamically changing maxReplicaCount in order to scale out would make the alarm worthless.
Well, that's not the case for those who have workloads where reaching the maximum is expected. We have a bunch of queue processing workloads working against 2-3 centralized backend systems. It's really hard to determine the correct max replicas per deployment across the whole system. We are working with guesstimates, though we have good indicators when some central systems start to be saturated. Dropping some not-that-important workload would serve us well.
If you choose to dynamically define them, then I think it's safe to say you opt in to the behavior and should be OK.
I suspect that HPA is not designed to be used in this way. What I do see is that a lot of code is bypassed when current replicas exceed max replicas. This code implements the customizable behaviors, including the default behavior, which may result in erratic scaling: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#default-behavior https://github.com/kubernetes/kubernetes/blob/HEAD/pkg/controller/podautoscaler/horizontal.go#L822
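For context, the customizable behaviors referenced above live in the standard autoscaling/v2 `behavior` field of an HPA. A minimal sketch (the names `example-hpa` and `example-worker` are hypothetical; the field values shown for scale-down match the documented defaults):

```yaml
# Sketch of the HPA behavior fields discussed above (autoscaling/v2).
# These stabilization windows and policies are the logic that gets
# bypassed when current replicas already exceed maxReplicas.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa            # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-worker       # hypothetical target
  minReplicas: 1
  maxReplicas: 20
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # default scale-down window
      policies:
      - type: Percent
        value: 100                      # default: drop up to 100%
        periodSeconds: 15               # per 15-second period
```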
Well, the code bypass would be a kind of desired behavior. I do not think anyone would miss the scale-down behavior modifiers in those cases. Our workloads have the following use cases:
Proposal
There are certain resource saturation situations which could restrict scaling out. While these situations can sometimes be incorporated into a formula in scalingModifiers, that could result in a fairly complex one, especially when we already have multiple triggers. I'd suggest that the maxReplicaCount of the ScaledObject could be an int or a string; if it does not parse as an integer, it would be evaluated as a formula. The trigger definitions could be used as a source for the calculation.
Use-Case
We have several worker processes that read/write data between multiple datastores (Redis, MySQL).
We are processing queues and would like to process them fast enough. However, if all workers are doing jobs everywhere, then some of the systems could get saturated and eventually break.
At the moment we have empirically set maximum values that try to save our backend systems from saturation. Still, that happens from time to time.
We have good metrics/indicators to detect saturation. We have an alerting system in place. When the alert fires we manually set the maximum number of replicas. Unfortunately, the time between the saturation detection and the scale down is really crucial; we need to act within 2-5 minutes. Sometimes we could not act that fast.
It would be good to automate these evaluations and actions. We already have our scalers in KEDA, and it controls our HPAs.
The saturation metrics/indicators could be collected with existing KEDA scalers (triggers).
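To illustrate, a hypothetical ScaledObject combining this use case with the proposed syntax might look like the sketch below. A formula-valued maxReplicaCount does not exist in KEDA today; that line shows only the syntax this issue proposes. The prometheus trigger type and its serverAddress/query/threshold metadata are real KEDA fields, but the metric, query, and resource names here are invented:

```yaml
# Hypothetical sketch only: formula-valued maxReplicaCount is NOT
# implemented in KEDA; this illustrates the proposal. The prometheus
# trigger type is real, but the names and query are invented.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: queue-worker-scaler          # hypothetical
spec:
  scaleTargetRef:
    name: queue-worker               # hypothetical Deployment
  minReplicaCount: 1
  # Today this must be an int; under the proposal a string would be
  # evaluated as a formula over the named triggers below.
  maxReplicaCount: "saturation > 0.8 ? 5 : 30"
  triggers:
  - type: prometheus
    name: saturation                 # referenced by the formula above
    metadata:
      serverAddress: http://prometheus.monitoring:9090
      query: avg(backend_saturation_ratio)   # invented metric
      threshold: "0.8"
```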
Is this a feature you are interested in implementing yourself?
No
Anything else?
No response