Allow formula based dynamic maxReplicas #5938

Open
lkishalmi opened this issue Jul 2, 2024 · 10 comments
Labels
feature-request, needs-discussion

Comments

@lkishalmi
Contributor

Proposal

There are certain resource saturation situations that should restrict scaling out.

While these situations can sometimes be incorporated into a formula in scalingModifiers, that could result in a fairly complex formula, especially when multiple triggers are already in place.

I'd suggest that the maxReplicaCount of the ScaledObject could be either an int or a string; if it does not parse as an integer, it would be evaluated as a formula. The trigger definitions could be used as a source for the calculation.
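A minimal sketch of what that could look like, assuming the hypothetical string form; the formula syntax, trigger names, queries, and thresholds below are purely illustrative and do not exist in KEDA today:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: worker
spec:
  scaleTargetRef:
    name: worker
  minReplicaCount: 1
  # Today this field must be an integer; the proposal is that a string value
  # would instead be evaluated as a formula over the named triggers (hypothetical syntax).
  maxReplicaCount: "db_saturation > 0.8 ? 10 : 50"
  triggers:
    - type: redis
      name: jobs_queue
      metadata:
        address: redis.default.svc:6379
        listName: jobs
        listLength: "100"
    - type: prometheus
      name: db_saturation
      metadata:
        serverAddress: http://prometheus:9090
        query: max(mysql_saturation_ratio)
        threshold: "0.8"
```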

Use-Case

We have several worker processes that read/write data between multiple datastores (Redis, MySQL).

We are processing queues and would like to process them fast enough. However, if all workers are busy everywhere at once, some of the systems can get saturated and eventually break.

At the moment we have empirically set maximum values that try to protect our backend systems from saturation. Still, saturation happens from time to time.

We have good metrics/indicators to detect saturation and an alerting system in place. When an alert fires, we manually set the maximum number of replicas. Unfortunately, the time between detecting saturation and scaling down is crucial: we need to act within 2-5 minutes, and sometimes we cannot act that fast.

It would be good to automate these evaluations and actions. We already have our scalers in KEDA, which controls our HPAs.

The saturation metrics/indicators could be collected with existing KEDA scalers (triggers).

Is this a feature you are interested in implementing yourself?

No

Anything else?

No response

@lkishalmi added the feature-request and needs-discussion labels on Jul 2, 2024
@JorTurFer
Member

Hello,
Currently, you can do it using scalingModifiers. If you include something like min(your_max_value, max(scalers)), you can cap the composite metric with whatever formula you want, which effectively limits the maximum.
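For reference, a rough sketch of that workaround with today's scalingModifiers; the trigger names, metadata values, and target are illustrative, and each trigger needs a name so the formula can reference it:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: worker
spec:
  scaleTargetRef:
    name: worker
  maxReplicaCount: 50
  advanced:
    scalingModifiers:
      # Cap the composite metric at 10; with target "1" this effectively
      # limits the desired replica count to about 10.
      formula: "min(10, max(jobs_queue, backend_saturation))"
      target: "1"
  triggers:
    - type: redis
      name: jobs_queue
      metadata:
        address: redis.default.svc:6379
        listName: jobs
        listLength: "100"
    - type: prometheus
      name: backend_saturation
      metadata:
        serverAddress: http://prometheus:9090
        query: max(mysql_saturation_ratio)
        threshold: "0.8"
```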

@lkishalmi
Contributor Author

Yes. Though unfortunately scalingModifiers do not work with traditional cpu or memory triggers.

@JorTurFer
Member

Interesting point!
Let's see what other @kedacore/keda-contributors think.

@lkishalmi
Contributor Author

Also, AFAIK changing the max value is a stronger rule than a utilization-based metric. Usually when we detect some kind of saturation in the system, we do not really care about scale up/down policies, stabilization windows, etc. The autoscaler should simply drop replicas that are over the new limit.

@joebowbeer
Contributor

Reaching maxReplicaCount is often treated as an alarm-worthy event.

Dynamically changing maxReplicaCount in order to scale out would make the alarm worthless.

@lkishalmi
Contributor Author

Well, those who have workloads where reaching the maxReplicaCount could be alarming probably should not use a dynamically changing maxReplicaCount.

We have a bunch of queue-processing workloads working against 2-3 centralized backend systems. It's really hard to determine the correct max replicas per deployment across the whole system. We are working with guesstimates, though we have good indicators of when some central systems start to become saturated. Dropping some less important workload would serve us well.

@JorTurFer
Member

@zroubalik @tomkerkhove ?

@tomkerkhove
Member

> Reaching maxReplicaCount is often treated as an alarm-worthy event.
>
> Dynamically changing maxReplicaCount in order to scale out would make the alarm worthless.

If you choose to dynamically define them, then I think it's safe to say you opt in to the behavior and should be OK.

@joebowbeer
Contributor

joebowbeer commented Sep 3, 2024

I suspect that HPA is not designed to be used in this way.

What I do see is that a lot of code is bypassed when current replicas exceed max replicas. That code implements the customizable behaviors, including the default behavior, so bypassing it may result in erratic scaling:

https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#default-behavior

https://github.com/kubernetes/kubernetes/blob/HEAD/pkg/controller/podautoscaler/horizontal.go#L822

@lkishalmi
Contributor Author

Well, the code bypass would be kind of the desired behavior. I do not think anyone would miss scale-down behavior modifiers in those cases. Our workload has the following use cases:

  1. We have a weekly archival process that affects the overall throughput of our system, so during those times we could cap other interfering workflows far lower than on other days. That would be a cron-based trigger (see the sketch after this list).
  2. We are measuring the saturation of our core shared resources. If we had a trigger on resource saturation, the desired action would be to drop non-critical workloads.
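To make these concrete, a purely hypothetical sketch of how the proposed formula-based maxReplicaCount might combine a cron trigger (the weekly archival window) with a saturation indicator; none of this formula syntax exists in KEDA today, and the names, schedule, queries, and thresholds are made up for illustration:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: queue-worker
spec:
  scaleTargetRef:
    name: queue-worker
  # Hypothetical: cap non-critical workers at 5 replicas while the weekly
  # archival window is active or the shared backends are saturated.
  maxReplicaCount: "archival_window > 0 || backend_saturation > 0.8 ? 5 : 40"
  triggers:
    - type: cron
      name: archival_window
      metadata:
        timezone: Etc/UTC
        start: "0 2 * * 6"
        end: "0 8 * * 6"
        desiredReplicas: "1"
    - type: prometheus
      name: backend_saturation
      metadata:
        serverAddress: http://prometheus:9090
        query: max(mysql_saturation_ratio)
        threshold: "0.8"
```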
