Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KEP-2625: Promote the CPUManager Policy Option full-pcpus-only to GA #5108

Merged

Conversation

ffromani
Copy link
Contributor

@ffromani ffromani commented Jan 30, 2025

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 30, 2025
@k8s-ci-robot k8s-ci-robot added the kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory label Jan 30, 2025
@ffromani
Copy link
Contributor Author

/sig node

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 30, 2025
cosmetic changes, no changes in content besides the bare minimum
neededl to adjust to the new template.

Signed-off-by: Francesco Romani <[email protected]>
catch up with the updates since beta graduation

Signed-off-by: Francesco Romani <[email protected]>
@ffromani ffromani force-pushed the cpumanager-policy-options-to-ga branch 2 times, most recently from 5bde288 to eb7974c Compare January 30, 2025 12:16
@kannon92
Copy link
Contributor

/retitle KEP-2625: Promote CPUManager Policy Options to GA

@k8s-ci-robot k8s-ci-robot changed the title KEP-2625: Update CPU Manager Policy Options 1.33 GA KEP-2625: Promote CPUManager Policy Options to GA Jan 31, 2025

For all these reasons we postponed this work to a later date.
TBD
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/hold

We should probably have this filled out before merging.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/hold cancel

done

@@ -286,7 +250,7 @@ NOTE: Even though the feature gate is enabled by default the user still has to e
The alpha-quality options are hidden by default and only if the `CPUManagerPolicyAlphaOptions` feature gate is enabled the user has the ability to use them.
The beta-quality options are visible by default, and the feature gate allows a positive acknowledgement that non stable features are being used, and also allows to optionally turn them off.
Based on the graduation criteria described below, a policy option will graduate from a group to the other (alpha to beta).
We plan to removete the `CPUManagerPolicyAlphaOptions` and `CPUManagerPolicyBetaOptions` after all options graduated to stable, after a feature cycle passes without new planned options, and not before 1.28, to give ample time to the work in progress option to graduate at least to beta.
We plan to remove the `CPUManagerPolicyAlphaOptions` and `CPUManagerPolicyBetaOptions` after all options graduated to stable, after a feature cycle passes without new planned options, and not before 1.28, to give ample time to the work in progress option to graduate at least to beta.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like we won't ever remove CPUManagerPolicyAlphaOptions and CPUManagerPolicyBetaOptions.

We have uncorecache that is now an alpha policy option.

Is our plan really to remove this and then add it back when people want a new CPUManagerPolicy?

Or do we want to have dedicated feature gates for each policy option now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point, I need to rephrase to convey the meaning here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still have feature gates guarding groups of options (and totally we have a bunch of alpha options, not just uncorecache). Arguably the removal strategy of these FGs is a bit unspecified. We didn't touch this subject much after the KEP was first discussed. I'm open to suggestions here.

Copy link
Contributor

@swatisehgal swatisehgal Feb 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we didn't explicitly discuss the removal strategy for feature gates. Removing a feature gate after a single cycle without new policy options seems a bit aggressive, but keeping feature gates indefinitely when they no longer guard any options isn’t ideal either.

Once all policy options have graduated to GA, it might be reasonable to wait at least two cycles before removing the feature gates if no new options are planned? Let's have this discussion during the PRR review of this PR.

Additionally, as new policy options become less frequent, we should revisit whether to reintroduce CPUManagerPolicyAlphaOptions and CPUManagerPolicyBetaOptions when new options are proposed after feature gates have been removed. This would help avoid feature gate proliferation - the main reason we moved away from a feature gate per policy option or determine if we should revert to the standard approach of gating each policy option individually.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this alternative, but the problem here is which KEP is supposed to track this work.
This KEP wanted to introduce a single new option, and we added the CPUManagerPolicyAlphaOptions and CPUManagerPolicyBetaOptions along the way, as outcome of the conversation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. All policy option KEPs already enumerate the feature gates that guard them. If that is not the case, it should be rectified. So, when all options have graduated to GA, the responsibility of cleaning up these feature gates should fall on the last policy option reaching GA.

Based on our discussion in SIG Node yesterday, it would make sense to couple this with other cleanup work to be done after GA graduation. The recommended timeline here is three releases, so IMO we should follow the same approach.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to agree. Either that or a new minimal KEP to remove the gates once we reached idle time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once PRR team has had a chance to weigh in on this, I would capture this discussion in the KEP or in the issue so that the plan is properly documented for future reference and doesn't get lost in github comments.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that makes sense. Once we are done with most options we can retire the gates. If later we add an option here or there, we can probably give them their own gates (or not, we can decide then).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect, further rephrased to capture this conversation.


N/A.

###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

@ffromani ffromani Feb 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean, but at the same time this is what the template is suggesting https://github.com/kubernetes/enhancements/blob/master/keps/NNNN-kep-template/README.md#what-are-the-slis-service-level-indicators-an-operator-can-use-to-determine-the-health-of-the-service and I followed that. What else should be set here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the "errors" here system or user error? If they are system errors, then the error rate (errors/requests) is an SLI. If they are user errors, it's not really. Latency for configuring this could be another SLI, but it is encompassed by the general Pod provisioning SLI and I don't think we need to track this separately.

Since there are no SLOs the SLIs don't mean much. But if this is a system error, we probably should use it to specify an SLO as well.

I am OK leaving these in here for documentation, regardless (or they could move to the troubleshooting section)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cpu_manager_pinning_errors_total are system errors. Meaning: the system was asked to pin CPUs because the workloads requests them, but failed to do so. cpu_manager_pinning_requests_total tracks the number of workloads (containers) managed by the kubelet which requested cpu pinning (the usual cpumanager requirement: guaranteed QoS pods, integral CPU requested).

AFAICT I agree there is no SLOs, but still these metrics may be helpful to troubleshoot nodes.

TL;DR: AFAIU I can leave this part unchanged, so I'm doing that. Please let me know if I need to change something after the clarification above.


###### Are there any missing metrics that would be useful to have to improve observability of this feature?

We can detail the pinning errors total with a new metric like `cpu_manager_errors_count` or
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For GA, I would expect that this metric is there already.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, we can add this as GA requirement.
Depends on: kubernetes/kubernetes#129529

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if it depends on that issue, then are we not promoting this to GA in this release? and adding a new metric as a new beta?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While we can wait for another cycle, adding metrics in the context of graduation is something we did repeatedly in the past and AFAIK is not controversial or uncommon.
I can also redo the work I did on 129529, it's just a bit wasteful and I'd prefer not to, but I can if this simplifies the flow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While we can wait for another cycle, adding metrics in the context of graduation is something we did repeatedly in the past and AFAIK is not controversial or uncommon. I can also redo the work I did on 129529, it's just a bit wasteful and I'd prefer not to, but I can if this simplifies the flow.

Actually no, we don't need to depend on this. On second look the redo is minimal. Will file a new PR. Thanks for the comment!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding them in a separate PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 31, 2025
@ffromani ffromani force-pushed the cpumanager-policy-options-to-ga branch from eb7974c to a439e40 Compare February 3, 2025 08:04
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 3, 2025
@@ -183,80 +199,31 @@ With `threads_per_cpu` is typical 2 on x86_64 with SMT enabled - but this number
In order to make the resource reporting consistent, and avoiding cascading changes in the system, we enforce the request constraints at admission time.
This approach follows what the Topology Manager already does.

### Alternatives
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove the alternatives? I think we should keep them for future reference and context.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved at the bottom per (AFAIU) new KEP template

@@ -286,7 +250,7 @@ NOTE: Even though the feature gate is enabled by default the user still has to e
The alpha-quality options are hidden by default and only if the `CPUManagerPolicyAlphaOptions` feature gate is enabled the user has the ability to use them.
The beta-quality options are visible by default, and the feature gate allows a positive acknowledgement that non stable features are being used, and also allows to optionally turn them off.
Based on the graduation criteria described below, a policy option will graduate from a group to the other (alpha to beta).
We plan to removete the `CPUManagerPolicyAlphaOptions` and `CPUManagerPolicyBetaOptions` after all options graduated to stable, after a feature cycle passes without new planned options, and not before 1.28, to give ample time to the work in progress option to graduate at least to beta.
We plan to remove the `CPUManagerPolicyAlphaOptions` and `CPUManagerPolicyBetaOptions` after all options graduated to stable, after a feature cycle passes without new planned options, and not before 1.28, to give ample time to the work in progress option to graduate at least to beta.
Copy link
Contributor

@swatisehgal swatisehgal Feb 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we didn't explicitly discuss the removal strategy for feature gates. Removing a feature gate after a single cycle without new policy options seems a bit aggressive, but keeping feature gates indefinitely when they no longer guard any options isn’t ideal either.

Once all policy options have graduated to GA, it might be reasonable to wait at least two cycles before removing the feature gates if no new options are planned? Let's have this discussion during the PRR review of this PR.

Additionally, as new policy options become less frequent, we should revisit whether to reintroduce CPUManagerPolicyAlphaOptions and CPUManagerPolicyBetaOptions when new options are proposed after feature gates have been removed. This would help avoid feature gate proliferation - the main reason we moved away from a feature gate per policy option or determine if we should revert to the standard approach of gating each policy option individually.

@ffromani ffromani changed the title KEP-2625: Promote CPUManager Policy Options to GA KEP-2625: Promote the CPUManager Policy Option full-pcpus-only to GA Feb 4, 2025
Copy link
Contributor

@swatisehgal swatisehgal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

Looks good from node-perspective!

@@ -286,7 +250,7 @@ NOTE: Even though the feature gate is enabled by default the user still has to e
The alpha-quality options are hidden by default and only if the `CPUManagerPolicyAlphaOptions` feature gate is enabled the user has the ability to use them.
The beta-quality options are visible by default, and the feature gate allows a positive acknowledgement that non stable features are being used, and also allows to optionally turn them off.
Based on the graduation criteria described below, a policy option will graduate from a group to the other (alpha to beta).
We plan to removete the `CPUManagerPolicyAlphaOptions` and `CPUManagerPolicyBetaOptions` after all options graduated to stable, after a feature cycle passes without new planned options, and not before 1.28, to give ample time to the work in progress option to graduate at least to beta.
We plan to remove the `CPUManagerPolicyAlphaOptions` and `CPUManagerPolicyBetaOptions` after all options graduated to stable, after a feature cycle passes without new planned options, and not before 1.28, to give ample time to the work in progress option to graduate at least to beta.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once PRR team has had a chance to weigh in on this, I would capture this discussion in the KEP or in the issue so that the plan is properly documented for future reference and doesn't get lost in github comments.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 5, 2025
@ffromani ffromani force-pushed the cpumanager-policy-options-to-ga branch from a439e40 to 507487e Compare February 5, 2025 17:28
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 5, 2025
@ffromani
Copy link
Contributor Author

ffromani commented Feb 5, 2025

@swatisehgal fully agree, the outcome must be recorded in the KEP itsellf. Perhaps other sig-arch members should be involved in the convo. I captured your comments and moved in a separate section, PTAL again when you have time

Copy link
Contributor

@swatisehgal swatisehgal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 6, 2025
Copy link
Member

@johnbelamaric johnbelamaric left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few points in the comments I sent individually, once those updates are done, PRR should be good to go.

File missing content

Signed-off-by: Francesco Romani <[email protected]>
@ffromani ffromani force-pushed the cpumanager-policy-options-to-ga branch from 507487e to da6db82 Compare February 11, 2025 09:17
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 11, 2025
@ffromani
Copy link
Contributor Author

thanks @johnbelamaric ! I think I addressed all the review comments.

Copy link
Member

@johnbelamaric johnbelamaric left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 11, 2025
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ffromani, johnbelamaric, mrunalp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 11, 2025
@k8s-ci-robot k8s-ci-robot merged commit d2e54fa into kubernetes:master Feb 11, 2025
4 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.33 milestone Feb 11, 2025
@ffromani ffromani deleted the cpumanager-policy-options-to-ga branch February 11, 2025 16:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/node Categorizes an issue or PR as relevant to SIG Node. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants