Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CRUDKubernetesDependentResource should update dependent resource on API version update #2644

Open
thomasdraebing opened this issue Dec 18, 2024 · 7 comments
Milestone

Comments

@thomasdraebing
Copy link

Bug Report

What did you do?

Our operator manages multiple levels of CustomResources, i.e. CR1 will lead to creation/update of CR2 (and other resources) where both CRs are managed by different reconcilers in the same operator. CRD2's API is very stable and once configured values in CR2 rarely change. This is not the case in CR1 though. Both CRs belong to the same API group and thus have the same version, i.e. if the API in CR1 is updated, the version of both CRDs is updated.
We performed such an API version update and updated the CRDs and CR1s in our cluster.

What did you expect to see?

CR2's API version should have been updated.

What did you see instead? Under which circumstances?

While requesting CR2 from the K8s API server returns CR2 with the updated version, the resource stored in ETCD still contains the old API version. This leads to issues after multiple updates, if at one point older versions are being removed from the CRD spec. In our case, the API is still quite unstable and version updates happen frequently.

Environment

Kubernetes cluster type:

Gardener

$ Mention java-operator-sdk version from pom.xml file

4.8.3

$ java -version

From pom.xml (the image used to run the operator):

<image>gcr.io/distroless/java17-debian12</ima

$ kubectl version

Client Version: v1.31.3
Kustomize Version: v5.4.2
Server Version: v1.29.10

Possible Solution

The issue here is that the API version is not taken into account when comparing the currently applied and the wanted state in KubernetesDependentResource.match(). However, since the Reconciler in this case specifically wants to create the dependent resource with the new version, even if the spec didn't change, I believe the API version should be considered in this case.

Additional context

@csviri
Copy link
Collaborator

csviri commented Dec 19, 2024

Hi @thomasdraebing , just to understand better, do you have a conversion hook in place?

I never tried personally but there is also this migrator: https://github.com/kubernetes-sigs/kube-storage-version-migrator

But in general I agree that we could check explicitly the API versions. For now you can do by overriding the matching and checking the API versions before calling super method.

@thomasdraebing
Copy link
Author

Hi @csviri,

so far we don't have conversion webhooks in place, since it was not strictly necessary (API versions introduced new fields, but didn't do any type changes).

I am aware of the migrator, although we had some trouble getting it to work. At the moment, I am using a workaround that uses an annotation containing the API version and including annotations into the matching of resources. This works, but having built in support would be nice and your suggestion to override the matching would already be better. I will have a go at it myself in the next few days, just wanted to bring it up for discussion first.

@thomasdraebing
Copy link
Author

Thinking about this further, it might not be as easy, since the Kubernetes API server will return whichever API version is the one currently marked as stored and not the one that is actually being stored in ETCD. Thus, the operator will actually always get the version that is currently marked as stored, which will most likely be the one the dependent resource should be updated to. Thus, depending on the cache state, the operator itself might not know the actually stored resource version. Next to doing the same as the storage version migrator, the only ideas I can come up with are:

  • same as the workaround I have described above: Have an annotation with the originally applied API version that is added by the operator (similar to javaoperatorsdk.io/previous)
  • since an API version update affecting resources managed by the operator will always require an operator update, there could be an option to always perform an apply/replace after restart/update of the operator.

@csviri
Copy link
Collaborator

csviri commented Dec 19, 2024

since an API version update affecting resources managed by the operator will always require an operator update, there could be an option to always perform an apply/replace after restart/update of the operator.

note that while your API is not stabilized you can just simply always return false from the matcher, that would solve the problem!? But these versions meant to be used with conversion hooks, so when updating that might be needed to work properly in all cases.

@thomasdraebing
Copy link
Author

thomasdraebing commented Dec 20, 2024

note that while your API is not stabilized you can just simply always return false from the matcher, that would solve the problem!?

That would of course work, but cause a lot of unnecessary reconciles. As mentioned, I have already a working workaround using an annotation. I just thought it might be neat to support such a scenario out of the box.

But these versions meant to be used with conversion hooks, so when updating that might be needed to work properly in all cases.

Not sure how a conversion webhook would work in this case, since the dependent resource, if nothing but the API version changes, would not be reconciled and thus the webhook would never be called.

It looks like there is a new alpha feature in Kubernetes 1.30 that provides a StorageVersionMigration resource for this purpose [1]. I guess providing StorageVersionMigrations together with CRD updates will be the way to go in the future.

[1] https://kubernetes.io/docs/tasks/manage-kubernetes-objects/storage-version-migration/

@csviri
Copy link
Collaborator

csviri commented Jan 6, 2025

since an API version update affecting resources managed by the operator will always require an operator update, there could be an option to always perform an apply/replace after restart/update of the operator.

This sounds like something that we could add to the core, but maybe as an opt-in experimental feature, especially since the
storage version migration looks to be solving the issue / will do in future.

@csviri csviri added this to the 5.1 milestone Jan 6, 2025
@csviri
Copy link
Collaborator

csviri commented Jan 6, 2025

Scheduled this for v5.1, but feel free to provide a PR before in case you have the bandwidth, and thank you for raising this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants