Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issues with SSA Matching and Approach to Handle Them #2038

Closed
csviri opened this issue Aug 29, 2023 · 4 comments · Fixed by #2042
Closed

Issues with SSA Matching and Approach to Handle Them #2038

csviri opened this issue Aug 29, 2023 · 4 comments · Fixed by #2042
Assignees

Comments

@csviri
Copy link
Collaborator

csviri commented Aug 29, 2023

Matching in dependent resources with Kubernetes objects is quite complex problem. When we SSA based resource create and update was introduced, we introduced also the SSABasedGenericKubernetesResourceMatcher, where the expectation was that this will cover nicely majority of cases (definitely not all of them).

Having some issues with SSA based matching was expected, there were obvious cases which are problematic. But there are always fallback strategies, either GenericKubernetesResourceMatcher or just implemeting a custom matcher.

This issue is to give an overview of the current problems and agree on strategy how to proceed with the SSA based matcher.

Problems

  1. Most of the problem we see is actually about how Kubernetes post processes resources after are applied, typical case (what is actually a bug in K8S, when stringData is converted to data in a Secret, but in the managed fields still stringData is referenced. See: Refinements to SSA matching #2028

  2. Other problem is, for example with default values in Ingress, see: Repeated server side apply creates additional revisions kubernetes/kubernetes#118519 (comment) . When the empty string value is removed, while in desired empty string is expected.

  3. Third example when, some values are added to the resource, but mainly some fields are removed ( since the feature what it is used for is behind a feature flag gate in K8S. See: NullPointerException in SSABasedGenericKubernetesResourceMatcher #2032 and related integration test where it actually works. Note that in this case typically in spec part status is added to volume claim template:

      status:
         phase: pending

Solutions

These problem are quite specific to those resources, but most of the cases we can just solve but adjust the desired resource. So if we fill in the default values (or remove the empty string for example, or the fields behind feature gate), the problem disappears. See in: #2037

So either:

  1. we delegate the problem to the users, that we say that please fill in the default values (or remove), and the matching will work. In case of stringData we can simply say that "this is a convenience feature, please don't use it".

  2. the framework could add (or make it easy to add) corrections, so if it seems that some known default values are missing, like mentioned above, it will fill in the default values. This can happen either in matching phase or as post processing the desired resource. The problem with this approach is that the implementation will end up reproducing K8S API Server logic, that might be complex (and might change based on the k8s version?)

@csviri csviri self-assigned this Aug 29, 2023
@csviri
Copy link
Collaborator Author

csviri commented Aug 29, 2023

cc @metacosm @shawkins

@metacosm
Copy link
Collaborator

Isn't solution 2 what admission controllers are supposed to do? I don't think that we should be duplicating this in the SDK logic. If SSA-based matcher / updater cannot prove reliable enough, maybe it shouldn't be the default? If we make it the default, we should try to detect the problematic cases, catch them and issue an appropriate exception with a detailed error message so that users could make the appropriate changes.
Another option would be to direct users to write their own matcher/updaters, which should be made as simple as possible. This would probably be also be much more efficient than a generic approach relying on deser…

@csviri
Copy link
Collaborator Author

csviri commented Aug 29, 2023

Isn't solution 2 what admission controllers are supposed to do?

not really, dynamic Admission Controllers are generic purpose validation and mutation aparatus.

Another option would be to direct users to write their own matcher/updaters, which should be made as simple as possible.

Basically that is the aim yes, so having customization options like the ignore list for GenericKubernetesResourceMatcher, is a way to be able to customize it.

If SSA-based matcher / updater cannot prove reliable enough, maybe it shouldn't be the default?

It's a good question which one should be the default, the GenericKubernetesResourceMatcher in those cases would not fail too.

f we make it the default, we should try to detect the problematic cases, catch them and issue an appropriate exception with a detailed error message so that users could make the appropriate changes.

yeah, but that we can also fix that, if we detect it.

So what I mean is that this never will be perfect, what we can do, as you also said:

  • Provide some easily extensible default toolkit
  • But maybe cover some common cases, like add there defaults, which commonly causes problems.

@shawkins
Copy link
Collaborator

It's a good question which one should be the default, the GenericKubernetesResourceMatcher in those cases would not fail too.

The non-ssa generic matcher is tolerant to things being added to the spec, but will detect a false positive on any case that involves pruning - that is problematic as it can lead to an indefinite reconciliation loop in the cases where any SSA produces an extra revision. One possible hedge against this is a general update filter: https://github.com/operator-framework/java-operator-sdk/pull/2028/files#diff-cc200829acb22dc9f55cb690b4065bda5465a7aa27ace4941893b6d70ab34ef2 - you'll still go ahead and make the update, but as long as nothing actually changes between the old and new version, you'll skip processing the update.

The only way to proactively avoid the additional updates as @csviri is mentioning is additional sanitizing or type specific matching logic to compenstate for what the api-server is doing.

@csviri csviri linked a pull request Sep 4, 2023 that will close this issue
@csviri csviri closed this as completed Oct 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants