Skip to content

Conversation

Jrmy2402
Copy link

@Jrmy2402 Jrmy2402 commented Aug 19, 2025

What type of PR is this?
/kind documentation
/kind feature
/kind api-change

What this PR does / why we need it:
AEP for #8420


@k8s-ci-robot k8s-ci-robot added kind/documentation Categorizes issue or PR as related to documentation. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Aug 19, 2025
Copy link

linux-foundation-easycla bot commented Aug 19, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. do-not-merge/needs-area needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Aug 19, 2025
@k8s-ci-robot
Copy link
Contributor

Hi @Jrmy2402. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Jrmy2402
Once this PR has been reviewed and has the lgtm label, please assign towca for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added area/vertical-pod-autoscaler size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed do-not-merge/needs-area labels Aug 19, 2025
@Jrmy2402 Jrmy2402 force-pushed the AEP-cpu-memory-ratio branch from d7748f1 to 4ca0ab5 Compare August 19, 2025 15:16
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Aug 19, 2025
@Jrmy2402 Jrmy2402 force-pushed the AEP-cpu-memory-ratio branch from 4ca0ab5 to 61e4f32 Compare August 19, 2025 15:17
@adrianmoisey
Copy link
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Aug 19, 2025
@k8s-triage-robot
Copy link

This PR may require API review.

If so, when the changes are ready, complete the pre-review checklist and request an API review.

Status of requested reviews is tracked in the API Review project.

Copy link
Member

@omerap12 omerap12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this! some notes from me

Copy link
Member

@omerap12 omerap12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM with minor comments

@omerap12
Copy link
Member

/retitle AEP-8459: MemoryPerCPU Enforce

@k8s-ci-robot k8s-ci-robot changed the title docs: add AEP for MemoryPerCPU feature AEP-8459: MemoryPerCPU Enforce Aug 22, 2025
@Jrmy2402 Jrmy2402 force-pushed the AEP-cpu-memory-ratio branch from c5bbed0 to f4b20b6 Compare August 22, 2025 12:51
Copy link
Member

@omerap12 omerap12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good we just need to address this https://github.com/kubernetes/autoscaler/pull/8459/files#r2295957092

@omerap12
Copy link
Member

/label api-review

@k8s-ci-robot k8s-ci-robot added the api-review Categorizes an issue or PR as actively needing an API review. label Aug 31, 2025

### Behavior

* If both CPU and memory are controlled, VPA enforces the ratio.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if both (cpu and memory) are not specified? Should that be a validation error? It seems, like we should enforce that if you specify both you should get an error, this way we'll ensure that either you specify all the pieces of the puzzle, or none.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially, my thinking was to simply ignore memoryPerCPU if either CPU or memory was not specified in controlledResources.

But if the philosophy is rather to fail fast and return a validation error whenever memoryPerCPU is set without both CPU and memory being present, I’m fine with that approach too, I can update the AEP accordingly.

* Applies to Target, LowerBound, UpperBound, and UncappedTarget.
* Ratio enforcement is strict:
* If the memory recommendation would exceed `cpu * memoryPerCPU`, then **CPU is increased** to satisfy the ratio.
* If the CPU recommendation would exceed `memory / memoryPerCPU`, then **memory is increased** to satisfy the ratio.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm inclined to say we should error out if the math doesn't stand with the cpu and memory values, adjusting seems "magical", and I'd advice against it. Explicitness is always better.

Copy link
Author

@Jrmy2402 Jrmy2402 Sep 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point, implicit adjustments can indeed feel “magical.”
In this case, though, the whole purpose of the feature is to enforce the ratio automatically: if CPU or memory drifts away from the configured ratio, VPA brings them back in line.

If we only validated and errored, users wouldn’t get the behavior they’re asking for (“always keep memory = cpu × memoryPerCPU”), they’d just see failures.
That would make the feature much less useful in practice.

Or maybe I didn’t fully understand your point?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're talking about two distinct things 😅 I was more asking about the validation case, where we ensure that the provided memory and cpu ensure we can reach the configured memoryPerCPU. Iow. if I specify cpu=1, memory=4Gi and use memoryPerCPU=5 then the math won't work and that should fail validation, but cpu=1, memory=4Gi and memoryPerCPU=3 will work, b/c that's achievable.

Whereas you're talking about the actual enforcement, which indeed will be "magical" 😉, and that's totally fine.

Does that make sense?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely, we were talking about two different layers.

I’ve pushed a commit to clarify the validation side: 93d9437

@Jrmy2402
Copy link
Author

Hi, @soltysh,

Did you check my answers to your comments?

Thanks in advance

@Jrmy2402
Copy link
Author

Jrmy2402 commented Oct 7, 2025

Hi,

@omerap12, @adrianmoisey do you think it could be merged as it is today, or do you have some comments?

Thanks in advance

@omerap12
Copy link
Member

omerap12 commented Oct 7, 2025

Hi,

@omerap12, @adrianmoisey do you think it could be merged as it is today, or do you have some comments?

Thanks in advance

We can't merge this until a Kubernetes api reviewer lgtm it.
We will ping them :)

@jackfrancis
Copy link
Contributor

/release-note-none

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Oct 7, 2025
@jackfrancis
Copy link
Contributor

/label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Oct 7, 2025
Copy link
Contributor

@soltysh soltysh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From API perspective this seems good, my comments are non-blocking, but mostly clarifying.

* Applies to Target, LowerBound, UpperBound, and UncappedTarget.
* Ratio enforcement is strict:
* If the memory recommendation would exceed `cpu * memoryPerCPU`, then **CPU is increased** to satisfy the ratio.
* If the CPU recommendation would exceed `memory / memoryPerCPU`, then **memory is increased** to satisfy the ratio.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're talking about two distinct things 😅 I was more asking about the validation case, where we ensure that the provided memory and cpu ensure we can reach the configured memoryPerCPU. Iow. if I specify cpu=1, memory=4Gi and use memoryPerCPU=5 then the math won't work and that should fail validation, but cpu=1, memory=4Gi and memoryPerCPU=3 will work, b/c that's achievable.

Whereas you're talking about the actual enforcement, which indeed will be "magical" 😉, and that's totally fine.

Does that make sense?

@Jrmy2402
Copy link
Author

Thanks @soltysh, @omerap12, @adrianmoisey, @jackfrancis for the review.
What is the next step to merge the PR ?

@omerap12
Copy link
Member

Thanks! nothing blocking from my end.
/lgtm
/assign @adrianmoisey

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 15, 2025
Comment on lines +113 to +116
* Example 3: `memoryPerCPU = 4Gi`, with `--container-recommendation-max-allowed-memory=7Gi` or with `maxAllowed.memory=6Gi` set in VPA object
* Baseline recommendation: 2 CPUs, 4Gi memory
* UncappedTarget (ratio enforced): 2 CPUs, 8Gi
* Target (capped): 2 CPUs, 7Gi ← memory capped by max-allowed-memory; ratio not fully satisfied
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unsure if this is a typo or if I'm not understanding something, but surely the Target on this one would be 2 CPUs, 6Gi memory, since maxAllowed.memory is 6Gi?

Comment on lines +146 to +147
* Admission ensures the provided `CPU` and `memory` bounds make the configured `memoryPerCPU` reachable—there must exist at least one (`cpu`, `memory`) such that `memory = cpu × memoryPerCPU`, otherwise the object is rejected.
* Example: with cpu=1 and memory=4Gi, memoryPerCPU=5Gi is invalid (1×5Gi > 4Gi), while memoryPerCPU=3Gi is valid (1×3Gi ≤ 4Gi).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is cpu and memory referring to here? Is this on Pod admission or is this the minAllowed/maxAllowed in the VPA spec?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api-review Categorizes an issue or PR as actively needing an API review. area/vertical-pod-autoscaler cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/documentation Categorizes issue or PR as related to documentation. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

7 participants