🌱 Externalize CER phase objects into Secrets #2595

openshift-merge-bot[bot] merged 3 commits into operator-framework:main from
Conversation
Pull request overview
This PR introduces Secret-backed references for ClusterExtensionRevision (CER) phase objects so that etcd's object-size limit no longer constrains bundle size, updating the API/CRDs, applier, reconciler, and e2e coverage accordingly.
Changes:
- Add `ObjectSourceRef` and make CER phase objects support either inline `object` or Secret `ref` (exactly one required via validation).
- Externalize phase objects into immutable, content-addressable Secrets during revision apply, and resolve refs in the CER controller (with gzip auto-detection).
- Extend e2e and unit tests to validate refs, Secret immutability/labels/ownerRefs, and ref resolution behavior.
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| `api/v1/clusterextensionrevision_types.go` | Adds `ObjectSourceRef`, makes `object` optional, adds `ref`, and CEL validation for exclusivity. |
| `api/v1/clusterextensionrevision_types_test.go` | Adds validity tests for inline vs ref vs invalid combinations. |
| `api/v1/zz_generated.deepcopy.go` | Updates deepcopy behavior for the new `Ref` field/type. |
| `applyconfigurations/api/v1/objectsourceref.go` | Adds apply-configuration type for `ObjectSourceRef`. |
| `applyconfigurations/api/v1/clusterextensionrevisionobject.go` | Adds `Ref` support in apply-configuration for revision objects. |
| `applyconfigurations/utils.go` | Registers `ObjectSourceRef` apply-configuration by kind. |
| `manifests/experimental.yaml` | Updates CRD schema/docs for optional `object`, new `ref`, and validation. |
| `manifests/experimental-e2e.yaml` | Same CRD schema updates for the e2e manifest set. |
| `helm/.../olm.operatorframework.io_clusterextensionrevisions.yaml` | Updates packaged CRD YAML for the same schema changes. |
| `internal/operator-controller/labels/labels.go` | Adds `RevisionNameKey` label constant for listing revision-associated resources. |
| `internal/operator-controller/applier/secretpacker.go` | Introduces `SecretPacker` to bin-pack serialized objects into immutable Secrets and produce refs. |
| `internal/operator-controller/applier/boxcutter.go` | Externalizes desired revisions before SSA comparison; adds crash-safe create sequence and ownerRef patching helpers. |
| `internal/operator-controller/applier/boxcutter_test.go` | Updates tests to include corev1 scheme + system namespace for Secret operations. |
| `internal/operator-controller/applier/secretpacker_test.go` | Adds unit tests for packing, determinism, compression behavior, labels, immutability. |
| `internal/operator-controller/applier/externalize_test.go` | Adds focused tests for apply-config conversion + inline→ref replacement behavior. |
| `internal/operator-controller/controllers/clusterextensionrevision_controller.go` | Resolves Secret refs (including gzip) when building boxcutter phases; adds Secret RBAC marker. |
| `internal/operator-controller/controllers/resolve_ref_test.go` | Adds unit tests exercising ref resolution paths and common failure cases. |
| `cmd/operator-controller/main.go` | Passes configured `SystemNamespace` into Boxcutter. |
| `test/e2e/steps/steps.go` | Adds steps to validate refs + referenced Secrets and updates resource listing to resolve refs. |
| `test/e2e/features/install.feature` | Adds an e2e scenario asserting externalization invariants (refs, immutability, labels, ownerRefs). |
| `docs/concepts/large-bundle-support.md` | Adds design/behavior documentation for large bundle support via per-object Secret refs. |
```go
// Gzip large objects.
if len(data) > gzipThreshold {
	compressed, cErr := gzipData(data)
	if cErr != nil {
		return nil, fmt.Errorf("compressing object in phase %d index %d: %w", phaseIdx, objIdx, cErr)
```
SecretPacker computes the ref key from data after optional gzip compression. This makes ref.key (and therefore the Secret name) depend on the encoding choice/threshold and gzip implementation details rather than the underlying manifest content. Compute the hash from the canonical serialized JSON bytes, and then (optionally) store a gzipped payload under that stable key.
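A minimal sketch of the ordering this comment proposes, hashing the canonical JSON before optional compression; `contentKey`, `maybeGzip`, and the threshold values are illustrative names, not the PR's actual helpers:

```go
package main

import (
	"bytes"
	"compress/gzip"
	"crypto/sha256"
	"fmt"
)

// contentKey derives the Secret data key from the canonical JSON bytes,
// *before* any compression, so the key stays stable even if the gzip
// threshold or compression level changes later.
func contentKey(jsonBytes []byte) string {
	sum := sha256.Sum256(jsonBytes)
	return fmt.Sprintf("%x", sum[:8])
}

// maybeGzip compresses the payload only when it crosses the threshold;
// the stored value may differ, but the key remains tied to the content.
func maybeGzip(data []byte, threshold int) ([]byte, error) {
	if len(data) <= threshold {
		return data, nil
	}
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	doc := []byte(`{"kind":"ConfigMap","metadata":{"name":"x"}}`)
	key := contentKey(doc) // computed before compression
	small, _ := maybeGzip(doc, 1024)
	big, _ := maybeGzip(bytes.Repeat(doc, 100), 128)
	fmt.Println(key == contentKey(doc), len(small) == len(doc), len(big) < len(doc)*100)
}
```

With this split, two revisions that serialize the same object always produce the same key, regardless of whether the stored payload happens to be compressed.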
```go
// If adding this entry would exceed the limit, finalize the current Secret.
if currentSize+len(data) > maxSecretDataSize && len(currentData) > 0 {
	finalizeCurrent()
}

currentData[key] = data
currentSize += len(data)
```
When two objects have identical content, currentData[key] = data overwrites the existing entry but currentSize is still incremented. That can cause premature Secret finalization and extra Secrets (and could even trip the size check incorrectly if many duplicates exist). Only add to currentSize when the key is new (or track size based on the map contents).
Suggested change:

```diff
-// If adding this entry would exceed the limit, finalize the current Secret.
-if currentSize+len(data) > maxSecretDataSize && len(currentData) > 0 {
-	finalizeCurrent()
-}
-currentData[key] = data
-currentSize += len(data)
+// Only account for size when this is a new key in the current Secret.
+if _, exists := currentData[key]; !exists {
+	// If adding this entry would exceed the limit, finalize the current Secret.
+	if currentSize+len(data) > maxSecretDataSize && len(currentData) > 0 {
+		finalizeCurrent()
+	}
+	currentData[key] = data
+	currentSize += len(data)
+}
```
```diff
-obj := specObj.Object.DeepCopy()
+var obj *unstructured.Unstructured
+switch {
+case specObj.Object.Object != nil:
```
specObj.Object.Object != nil treats an empty-but-present inline object (object: {}) as set, which can lead to later failures when kind/name are missing. Since the API validation is presence-based, prefer checking for actual content (e.g., len(specObj.Object.Object) > 0) to avoid accepting empty inline objects as valid inputs here.
Suggested change:

```diff
-case specObj.Object.Object != nil:
+case len(specObj.Object.Object) > 0:
```
```go
secret := &corev1.Secret{}
key := client.ObjectKey{Name: ref.Name, Namespace: ref.Namespace}
if err := c.Client.Get(ctx, key, secret); err != nil {
	return nil, fmt.Errorf("getting Secret %s/%s: %w", ref.Namespace, ref.Name, err)
}
```
ObjectSourceRef.namespace is optional in the API/CRD, but resolveObjectRef performs Client.Get with ref.Namespace as-is. If a user supplies a ref without a namespace, this will always look in the empty namespace and fail. Either make namespace required at the API level, or implement a well-defined default (e.g., the controller/system namespace) before fetching.
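The defaulting option can be made explicit with a tiny helper; `resolveNamespace` and the `systemNamespace` value are illustrative names for this sketch, not the PR's actual code:

```go
package main

import "fmt"

// resolveNamespace applies a well-defined default when the optional
// ref.namespace field is empty; systemNamespace stands in for the
// controller's configured system namespace.
func resolveNamespace(refNamespace, systemNamespace string) string {
	if refNamespace == "" {
		return systemNamespace
	}
	return refNamespace
}

func main() {
	fmt.Println(resolveNamespace("", "olmv1-system"))          // empty ref namespace: defaulted
	fmt.Println(resolveNamespace("custom-ns", "olmv1-system")) // explicit namespace wins
}
```

Either approach works; the key point in the comment is that the behavior for an empty namespace must be defined somewhere rather than silently querying the empty namespace.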
```
Step 1: Create Secret(s) with revision label, no ownerReference
  |
  | crash here → Orphaned Secrets exist with no owner.
  |              ClusterExtension controller detects them on
  |              next reconciliation by listing Secrets with
  |              the revision label and checking whether the
  |              corresponding CER exists. If not, deletes them.
  v
Step 2: Create CER with refs pointing to the Secrets from step 1
  |
  | crash here → CER exists, Secrets exist, refs resolve.
  |              CER reconciler can proceed normally.
  |              Secrets have no ownerReferences yet.
  |              ClusterExtension controller retries step 3.
  v
Step 3: Patch ownerReferences onto Secrets (using CER uid)
  |
  | crash here → Some Secrets have ownerRefs, some don't.
  |              ClusterExtension controller retries patching
  |              the remaining Secrets on next reconciliation.
  v
Done — CER has refs, all Secrets exist with owner refs.
```

Key properties:

- **No reconciler churn**: Referenced Secrets exist before the CER is created. The CER reconciler never encounters missing Secrets during normal operation.
- **Orphan cleanup**: Secrets created in step 1 carry the revision label (`olm.operatorframework.io/revision-name`). If a crash leaves Secrets without a corresponding CER, the ClusterExtension controller detects and deletes them on its next reconciliation.
The crash-safety section claims the controller will detect and delete orphaned Secrets created in step 1 if the corresponding CER doesn't exist. The current implementation doesn't appear to delete orphaned ref Secrets; instead, ensureSecretOwnerReferences can later adopt any Secret with the revision label. Please align the doc with the actual behavior (or implement the described orphan cleanup).
> CER is created (see [Crash-safe creation sequence](#crash-safe-creation-sequence)).
> If a referenced Secret or key is not found — indicating an inconsistent state
> caused by external modification or a partially completed creation sequence —
> the reconciler sets a terminal error condition on the CER.
The doc states that if a referenced Secret/key is not found, the reconciler sets a terminal error condition on the CER. In the current controller implementation, ref resolution failures are treated as retryable (Retrying/Reconciling) and will requeue indefinitely. Please update the doc to match behavior or update the controller to make ref resolution failures terminal as described.
Suggested change:

```diff
-the reconciler sets a terminal error condition on the CER.
+the reconciler records a transient error condition on the CER and continues to
+requeue the revision for reconciliation until the reference is resolved or the
+CER is deleted.
```
```go
if len(data) >= 2 && data[0] == 0x1f && data[1] == 0x8b {
	reader, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, fmt.Errorf("creating gzip reader for key %q in Secret %s/%s: %w", ref.Key, ref.Namespace, ref.Name, err)
	}
	defer reader.Close()
	decompressed, err := io.ReadAll(reader)
	if err != nil {
		return nil, fmt.Errorf("decompressing key %q in Secret %s/%s: %w", ref.Key, ref.Namespace, ref.Name, err)
	}
	data = decompressed
```
io.ReadAll(reader) will fully decompress whatever is stored in the Secret key. Since refs can be user-provided, this allows a gzip bomb / oversized payload to drive high memory usage in the controller. Consider enforcing a maximum decompressed size (e.g., via io.LimitReader) before unmarshalling.
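A sketch of the bounded-decompression approach suggested here, using `io.LimitReader`; the 10 MiB cap and the helper names (`safeGunzip`, `gzipBytes`) are assumptions of this example, not the PR's code:

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

const maxDecompressed = 10 * 1024 * 1024 // 10 MiB; the exact cap is a policy choice

// gzipBytes is a small test helper that gzips a payload in memory.
func gzipBytes(b []byte) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write(b)
	zw.Close()
	return buf.Bytes()
}

// safeGunzip decompresses at most maxDecompressed bytes and fails loudly
// when the payload would expand beyond the cap, defusing gzip bombs.
func safeGunzip(data []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	// Read one byte past the cap so "exactly at the limit" can be
	// distinguished from "exceeds the limit".
	out, err := io.ReadAll(io.LimitReader(zr, maxDecompressed+1))
	if err != nil {
		return nil, err
	}
	if len(out) > maxDecompressed {
		return nil, fmt.Errorf("decompressed payload exceeds %d bytes", maxDecompressed)
	}
	return out, nil
}

func main() {
	out, err := safeGunzip(gzipBytes([]byte("hello")))
	fmt.Println(string(out), err)
}
```

Note the limit applies to the *decompressed* stream, so a small compressed payload cannot force the controller to allocate unbounded memory.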
```go
	resolved, err := c.resolveObjectRef(ctx, specObj.Ref)
	if err != nil {
		return nil, nil, fmt.Errorf("resolving ref in phase %q: %w", specPhase.Name, err)
	}
	obj = resolved
default:
	return nil, nil, fmt.Errorf("object in phase %q has neither object nor ref", specPhase.Name)
```
Ref resolution errors currently flow into setRetryingConditions and return an error, so a missing Secret/key will be retried indefinitely. The API adds ClusterExtensionRevisionReasonRefResolutionFailed, but it's not used here. If missing/invalid refs are meant to be terminal (as the docs suggest), set a non-retrying condition reason (e.g., RefResolutionFailed/Blocked) and avoid endless requeues.
```go
// List Secrets with the revision-name label
secretList := &corev1.SecretList{}
if err := bc.Client.List(ctx, secretList,
	client.InNamespace(bc.SystemNamespace),
	client.MatchingLabels{labels.RevisionNameKey: cer.Name},
); err != nil {
	return fmt.Errorf("listing ref Secrets for revision %q: %w", cer.Name, err)
}

var needsPatch []corev1.Secret
for _, s := range secretList.Items {
	hasOwnerRef := false
	for _, ref := range s.OwnerReferences {
		if ref.UID == cer.UID {
			hasOwnerRef = true
			break
		}
	}
	if !hasOwnerRef {
		needsPatch = append(needsPatch, s)
	}
}

if len(needsPatch) == 0 {
	return nil
}

return bc.patchOwnerRefsOnSecrets(ctx, cer.Name, cer.UID, needsPatch)
```
ensureSecretOwnerReferences lists all Secrets in the system namespace with olm.operatorframework.io/revision-name=<cer.Name> and patches an ownerReference onto any missing it. This can accidentally adopt stale/orphan Secrets from previous failed attempts (or any Secret with that label), even if they are not referenced by the CER. To avoid Secret bloat/unintended ownership, consider deriving the expected Secret names from cer.Spec.Phases[].Objects[].Ref and only patching those.
Suggested change:

```diff
-// List Secrets with the revision-name label
-secretList := &corev1.SecretList{}
-if err := bc.Client.List(ctx, secretList,
-	client.InNamespace(bc.SystemNamespace),
-	client.MatchingLabels{labels.RevisionNameKey: cer.Name},
-); err != nil {
-	return fmt.Errorf("listing ref Secrets for revision %q: %w", cer.Name, err)
-}
-var needsPatch []corev1.Secret
-for _, s := range secretList.Items {
-	hasOwnerRef := false
-	for _, ref := range s.OwnerReferences {
-		if ref.UID == cer.UID {
-			hasOwnerRef = true
-			break
-		}
-	}
-	if !hasOwnerRef {
-		needsPatch = append(needsPatch, s)
-	}
-}
-if len(needsPatch) == 0 {
-	return nil
-}
-return bc.patchOwnerRefsOnSecrets(ctx, cer.Name, cer.UID, needsPatch)
+// Derive the set of expected Secret references from the CER spec instead of
+// adopting all Secrets with the revision-name label. This avoids
+// accidentally adopting stale/orphaned Secrets.
+var secretsToPatch []corev1.Secret
+for _, phase := range cer.Spec.Phases {
+	for _, obj := range phase.Objects {
+		ref := obj.Ref
+		if ref == nil {
+			continue
+		}
+		// Only consider Secret references.
+		if ref.Kind != "Secret" {
+			continue
+		}
+		// Default to the system namespace if the reference does not specify one,
+		// preserving previous behavior that targeted Secrets in bc.SystemNamespace.
+		ns := ref.Namespace
+		if ns == "" {
+			ns = bc.SystemNamespace
+		}
+		secret := &corev1.Secret{}
+		if err := bc.Client.Get(ctx, client.ObjectKey{Namespace: ns, Name: ref.Name}, secret); err != nil {
+			if apierrors.IsNotFound(err) {
+				// If the referenced Secret no longer exists, skip it.
+				continue
+			}
+			return fmt.Errorf("getting referenced Secret %s/%s for revision %q: %w", ns, ref.Name, cer.Name, err)
+		}
+		secretsToPatch = append(secretsToPatch, *secret)
+	}
+}
+if len(secretsToPatch) == 0 {
+	return nil
+}
+return bc.patchOwnerRefsOnSecrets(ctx, cer.Name, cer.UID, secretsToPatch)
```
Codecov Report

```
@@            Coverage Diff             @@
##             main    #2595      +/-   ##
===========================================
+ Coverage   53.09%   68.86%   +15.76%
===========================================
  Files         137      139        +2
  Lines        9609     9876      +267
===========================================
+ Hits         5102     6801     +1699
+ Misses       4149     2558     -1591
- Partials      358      517      +159
```
Force-pushed from 606f31b to 8d43ef4.
Pull request overview
Copilot reviewed 22 out of 22 changed files in this pull request and generated 4 comments.
```go
key := contentHash(data)

// If adding this entry would exceed the limit, finalize the current Secret.
if currentSize+len(data) > maxSecretDataSize && len(currentData) > 0 {
	finalizeCurrent()
}

currentData[key] = data
currentSize += len(data)
currentPending = append(currentPending, pendingRef{pos: [2]int{phaseIdx, objIdx}, key: key})
}
```
SecretPacker.Pack increments currentSize by len(data) even when currentData already contains the same key (e.g., identical objects). Because currentData is a map, the later write overwrites and doesn't actually increase Secret size, but currentSize still grows, which can cause premature splitting into new Secrets or even a false "exceeds maximum" error. Consider only incrementing currentSize when inserting a new key (or track size based on map entries) while still adding pending refs for duplicates so multiple objects can point at the same key.
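The new-key-only accounting the comment describes can be sketched in a few lines; `addEntry` and its parameters are illustrative, not `SecretPacker`'s actual fields:

```go
package main

import "fmt"

// addEntry accounts for size only when the key is new, so duplicate
// content (same content-hash key) cannot inflate the running size:
// a repeated key would merely overwrite the existing map entry.
func addEntry(data map[string][]byte, size *int, key string, val []byte) {
	if _, exists := data[key]; exists {
		return // duplicate: map entry and size are already correct
	}
	data[key] = val
	*size += len(val)
}

func main() {
	data := map[string][]byte{}
	size := 0
	payload := []byte("same content")
	addEntry(data, &size, "abc123", payload)
	addEntry(data, &size, "abc123", payload) // duplicate: no size growth
	fmt.Println(len(data), size)
}
```

Pending refs for duplicate objects would still be appended separately, so multiple phase objects can point at the one stored key.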
```go
secret := &corev1.Secret{}
key := client.ObjectKey{Name: ref.Name, Namespace: ref.Namespace}
if err := c.Client.Get(ctx, key, secret); err != nil {
	return nil, fmt.Errorf("getting Secret %s/%s: %w", ref.Namespace, ref.Name, err)
}

data, ok := secret.Data[ref.Key]
if !ok {
	return nil, fmt.Errorf("key %q not found in Secret %s/%s", ref.Key, ref.Namespace, ref.Name)
}

// Auto-detect gzip compression (magic bytes 0x1f 0x8b)
if len(data) >= 2 && data[0] == 0x1f && data[1] == 0x8b {
	reader, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, fmt.Errorf("creating gzip reader for key %q in Secret %s/%s: %w", ref.Key, ref.Namespace, ref.Name, err)
	}
	defer reader.Close()
	decompressed, err := io.ReadAll(reader)
	if err != nil {
		return nil, fmt.Errorf("decompressing key %q in Secret %s/%s: %w", ref.Key, ref.Namespace, ref.Name, err)
	}
	data = decompressed
}

obj := &unstructured.Unstructured{}
if err := json.Unmarshal(data, &obj.Object); err != nil {
	return nil, fmt.Errorf("unmarshaling object from key %q in Secret %s/%s: %w", ref.Key, ref.Namespace, ref.Name, err)
```
ObjectSourceRef.Namespace is optional in the API/CRD, but resolveObjectRef uses ref.Namespace directly when constructing the Secret key. A CER with a ref that omits namespace will pass validation but then fail reconciliation. Either make namespace required at the API/CRD level, or default an empty ref.Namespace to the configured system namespace (and plumb that config into this reconciler).
Suggested change:

```diff
-secret := &corev1.Secret{}
-key := client.ObjectKey{Name: ref.Name, Namespace: ref.Namespace}
-if err := c.Client.Get(ctx, key, secret); err != nil {
-	return nil, fmt.Errorf("getting Secret %s/%s: %w", ref.Namespace, ref.Name, err)
-}
-data, ok := secret.Data[ref.Key]
-if !ok {
-	return nil, fmt.Errorf("key %q not found in Secret %s/%s", ref.Key, ref.Namespace, ref.Name)
-}
-// Auto-detect gzip compression (magic bytes 0x1f 0x8b)
-if len(data) >= 2 && data[0] == 0x1f && data[1] == 0x8b {
-	reader, err := gzip.NewReader(bytes.NewReader(data))
-	if err != nil {
-		return nil, fmt.Errorf("creating gzip reader for key %q in Secret %s/%s: %w", ref.Key, ref.Namespace, ref.Name, err)
-	}
-	defer reader.Close()
-	decompressed, err := io.ReadAll(reader)
-	if err != nil {
-		return nil, fmt.Errorf("decompressing key %q in Secret %s/%s: %w", ref.Key, ref.Namespace, ref.Name, err)
-	}
-	data = decompressed
-}
-obj := &unstructured.Unstructured{}
-if err := json.Unmarshal(data, &obj.Object); err != nil {
-	return nil, fmt.Errorf("unmarshaling object from key %q in Secret %s/%s: %w", ref.Key, ref.Namespace, ref.Name, err)
+ns := ref.Namespace
+if ns == "" {
+	return nil, fmt.Errorf("namespace must be specified in ObjectSourceRef when resolving Secret %q", ref.Name)
+}
+secret := &corev1.Secret{}
+key := client.ObjectKey{Name: ref.Name, Namespace: ns}
+if err := c.Client.Get(ctx, key, secret); err != nil {
+	return nil, fmt.Errorf("getting Secret %s/%s: %w", ns, ref.Name, err)
+}
+data, ok := secret.Data[ref.Key]
+if !ok {
+	return nil, fmt.Errorf("key %q not found in Secret %s/%s", ref.Key, ns, ref.Name)
+}
+// Auto-detect gzip compression (magic bytes 0x1f 0x8b)
+if len(data) >= 2 && data[0] == 0x1f && data[1] == 0x8b {
+	reader, err := gzip.NewReader(bytes.NewReader(data))
+	if err != nil {
+		return nil, fmt.Errorf("creating gzip reader for key %q in Secret %s/%s: %w", ref.Key, ns, ref.Name, err)
+	}
+	defer reader.Close()
+	decompressed, err := io.ReadAll(reader)
+	if err != nil {
+		return nil, fmt.Errorf("decompressing key %q in Secret %s/%s: %w", ref.Key, ns, ref.Name, err)
+	}
+	data = decompressed
+}
+obj := &unstructured.Unstructured{}
+if err := json.Unmarshal(data, &obj.Object); err != nil {
+	return nil, fmt.Errorf("unmarshaling object from key %q in Secret %s/%s: %w", ref.Key, ns, ref.Name, err)
```
```go
// Auto-detect gzip compression (magic bytes 0x1f 0x8b)
if len(data) >= 2 && data[0] == 0x1f && data[1] == 0x8b {
	reader, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, fmt.Errorf("creating gzip reader for key %q in Secret %s/%s: %w", ref.Key, ref.Namespace, ref.Name, err)
	}
	defer reader.Close()
	decompressed, err := io.ReadAll(reader)
	if err != nil {
		return nil, fmt.Errorf("decompressing key %q in Secret %s/%s: %w", ref.Key, ref.Namespace, ref.Name, err)
	}
	data = decompressed
}
```
resolveObjectRef uses io.ReadAll on potentially gzip-compressed Secret data without any upper bound on the decompressed size. A small gzip payload can expand massively and cause memory pressure/DoS. Consider enforcing a max decompressed size (e.g., via io.LimitReader) and returning a clear error when the limit is exceeded.
```go
}
objs = append(objs, resolved)
case len(specObj.Object.Object) > 0:
	objs = append(objs, &specObj.Object)
```
listExtensionRevisionResources silently skips a phase object when neither ref nor inline object is detected (no default/error branch). That can hide invalid CERs and return an incomplete resource list, which could lead to false positives in e2e assertions. Consider returning an error when an object has neither (or when the inline object map is empty) to make failures explicit.
Suggested change:

```diff
 objs = append(objs, &specObj.Object)
+default:
+	return nil, fmt.Errorf("phase %q object %d has neither ref nor inline object", phase.Name, j)
```
```go
//
// +optional
// +kubebuilder:validation:MaxLength=63
Namespace string `json:"namespace,omitempty"`
```
Does omission of namespace imply OLM system namespace? If so could we add a comment here for that?
```go
// List Secrets with the revision-name label
secretList := &corev1.SecretList{}
if err := bc.Client.List(ctx, secretList,
	client.InNamespace(bc.SystemNamespace),
```
I know we're currently just putting all the secrets in the system namespace, but if we have a namespace field on the ObjectSourceRefs in the CER, shouldn't they be used here? Or, do we even need the namespace field in there if they're always in SystemNamespace?
To maybe answer my own question: since CER controller doesn't handle Secrets lifecycle, it must assume they exist in the system namespace. Also, upon CER creation the ObjectSourceRef hasn't been filled out yet thus there's no namespace to reference in there. Is that right?
Not that it really matters at all but, I'd argue this PR is more of a ✨ than a 🌱 PR 🤷
- Fix duplicate key size inflation in SecretPacker by only incrementing size for new content hash keys
- Add io.LimitReader (10 MiB cap) for gzip decompression to prevent gzip bombs in controller and e2e helpers
- Add doc comment clarifying ObjectSourceRef.Namespace defaults to OLM system namespace during ref resolution
- Fix docs: orphan cleanup uses ownerReference GC, ref resolution failures are retried (not terminal)
- Remove unused ClusterExtensionRevisionReasonRefResolutionFailed constant
- Add default error branch in e2e listExtensionRevisionResources for objects missing both ref and inline content

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pull request overview
Copilot reviewed 22 out of 22 changed files in this pull request and generated 6 comments.
```go
t.Run("large object gets gzipped", func(t *testing.T) {
	// Create a large ConfigMap with data exceeding gzipThreshold (800 KiB).
	largeObj := testConfigMap("large-cm", "default")
	largeObj.Object["data"] = map[string]interface{}{
		// Repetitive data compresses well with gzip.
		"bigkey": strings.Repeat("a", gzipThreshold+1),
	}
```
The comment says the large object exceeds gzipThreshold (800 KiB), but gzipThreshold is currently 900 * 1024. Update the comment (or the constant) so the test description matches the actual threshold being exercised.
```
| crash here → Orphaned Secrets exist with no owner.
|              ClusterExtension controller detects them on
|              next reconciliation by listing Secrets with
|              the revision label and checking whether the
|              corresponding CER exists. If not, deletes them.
```
This section claims that if a crash occurs between creating ref Secrets and creating the CER, the controller will detect orphaned Secrets (revision-name label with no corresponding CER) and delete them. I couldn’t find any code in this PR that performs that orphan Secret deletion; the current crash recovery logic focuses on patching missing ownerReferences. Please either implement the described orphan cleanup behavior or adjust the document to reflect the actual behavior (e.g., reusing the already-created Secrets on retry).
Suggested change:

```diff
-| crash here → Orphaned Secrets exist with no owner.
-|              ClusterExtension controller detects them on
-|              next reconciliation by listing Secrets with
-|              the revision label and checking whether the
-|              corresponding CER exists. If not, deletes them.
+| crash here → Secrets may exist with the revision label but
+|              no corresponding CER yet. On the next
+|              reconciliation, the ClusterExtension controller
+|              reuses any existing Secrets with the revision
+|              label when (re)creating the CER instead of
+|              creating new ones.
```
```go
data, err := json.Marshal(obj.Object.Object)
if err != nil {
	return nil, fmt.Errorf("serializing object in phase %d index %d: %w", phaseIdx, objIdx, err)
}

// Gzip large objects.
if len(data) > gzipThreshold {
	compressed, cErr := gzipData(data)
	if cErr != nil {
		return nil, fmt.Errorf("compressing object in phase %d index %d: %w", phaseIdx, objIdx, cErr)
	}
	data = compressed
}

if len(data) > maxSecretDataSize {
	return nil, fmt.Errorf(
		"object in phase %d index %d exceeds maximum Secret data size (%d bytes > %d bytes) even after compression",
		phaseIdx, objIdx, len(data), maxSecretDataSize,
	)
}

key := contentHash(data)
```
SecretPacker computes ref.Key from data after optional gzip compression. That makes keys (and de-duplication) dependent on compression settings and could force new revisions if the gzip threshold/level changes even when the underlying object JSON is identical. Consider hashing the canonical JSON bytes (pre-compression) and using that hash as the Secret data key, while still optionally storing the value compressed under that stable key.
```go
	h.Write([]byte(k))
	h.Write(data[k])
}
return fmt.Sprintf("%s-%x", p.RevisionName, h.Sum(nil)[:8])
```
secretNameFromData prefixes the Secret name with the full revision name and then appends -<16 hex chars>. If the CER name is near Kubernetes' 253-char limit, this can produce an invalid Secret name and fail installs/upgrades for long extension names. Consider truncating the revision-name prefix to fit within 253 characters (leaving room for the dash + hash), or omit the prefix and rely on labels for association.
Suggested change:

```diff
-return fmt.Sprintf("%s-%x", p.RevisionName, h.Sum(nil)[:8])
+// Use the first 16 hex characters of the SHA-256 digest as the suffix.
+hashSuffix := fmt.Sprintf("%x", h.Sum(nil)[:8])
+// Kubernetes resource names are limited to 253 characters. Ensure the
+// generated Secret name "<revisionName>-<hashSuffix>" does not exceed
+// that limit by truncating the revisionName prefix if necessary.
+const maxKubeNameLen = 253
+maxPrefixLen := maxKubeNameLen - 1 - len(hashSuffix) // 1 for the "-"
+revisionName := p.RevisionName
+if maxPrefixLen < 0 {
+	// Fallback: if constraints are somehow inconsistent, use only the hash.
+	return hashSuffix
+}
+if len(revisionName) > maxPrefixLen {
+	revisionName = revisionName[:maxPrefixLen]
+}
+return fmt.Sprintf("%s-%s", revisionName, hashSuffix)
```
```go
for i := range secrets {
	s := &secrets[i]
	// Check if ownerRef already set
	alreadySet := false
	for _, ref := range s.OwnerReferences {
		if ref.UID == cerUID {
			alreadySet = true
			break
		}
	}
	if alreadySet {
		continue
	}

	patch := client.MergeFrom(s.DeepCopy())
	s.OwnerReferences = append(s.OwnerReferences, ownerRef)
	if err := bc.Client.Patch(ctx, s, patch); err != nil {
		return fmt.Errorf("patching ownerReference on Secret %s/%s: %w", s.Namespace, s.Name, err)
	}
```
patchOwnerRefsOnSecrets builds a merge patch from the Secret objects passed in. In createExternalizedRevision, those objects come from packResult.Secrets (locally constructed) and may not reflect the live Secret when Step 1 hit AlreadyExists. Because JSON merge patch treats arrays atomically, this can overwrite existing metadata.ownerReferences on an already-existing Secret instead of appending. Consider Get-ing each Secret first (or listing by name) and computing the patch from the live object before appending the ownerReference.
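The array-atomicity this comment relies on can be seen with a toy RFC 7386-style merge, a simplified sketch (a real JSON merge patch also treats null as a deletion marker, which is omitted here):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mergePatch applies a simplified RFC 7386 JSON merge patch: objects merge
// recursively, but arrays (like metadata.ownerReferences) are replaced
// wholesale, which is why a patch computed from a stale copy drops entries.
func mergePatch(target, patch map[string]interface{}) map[string]interface{} {
	for k, v := range patch {
		if pv, ok := v.(map[string]interface{}); ok {
			if tv, ok := target[k].(map[string]interface{}); ok {
				target[k] = mergePatch(tv, pv)
				continue
			}
		}
		target[k] = v // arrays and scalars replace; they never append
	}
	return target
}

func main() {
	live := map[string]interface{}{}
	json.Unmarshal([]byte(`{"metadata":{"ownerReferences":[{"uid":"existing"}]}}`), &live)
	// Patch computed from a stale local object that never saw "existing".
	patch := map[string]interface{}{}
	json.Unmarshal([]byte(`{"metadata":{"ownerReferences":[{"uid":"cer-uid"}]}}`), &patch)
	merged := mergePatch(live, patch)
	out, _ := json.Marshal(merged)
	fmt.Println(string(out)) // the pre-existing ownerReference is gone
}
```

Computing the patch from the freshly fetched live object, as the comment suggests, means the appended array already contains every existing entry before it replaces the server-side array.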
```go
out, err := k8sClient("get", "secret", ref.Name, "-n", ref.Namespace, "-o", "json")
if err != nil {
	return nil, fmt.Errorf("getting Secret %s/%s: %w", ref.Namespace, ref.Name, err)
}
var secret corev1.Secret
if err := json.Unmarshal([]byte(out), &secret); err != nil {
	return nil, fmt.Errorf("unmarshaling Secret %s/%s: %w", ref.Namespace, ref.Name, err)
}
data, ok := secret.Data[ref.Key]
if !ok {
	return nil, fmt.Errorf("key %q not found in Secret %s/%s", ref.Key, ref.Namespace, ref.Name)
}
// Auto-detect gzip compression (magic bytes 0x1f 0x8b)
if len(data) >= 2 && data[0] == 0x1f && data[1] == 0x8b {
	reader, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, fmt.Errorf("creating gzip reader for key %q in Secret %s/%s: %w", ref.Key, ref.Namespace, ref.Name, err)
	}
	defer reader.Close()
	const maxDecompressedSize = 10 * 1024 * 1024 // 10 MiB
	limited := io.LimitReader(reader, maxDecompressedSize+1)
	decompressed, err := io.ReadAll(limited)
	if err != nil {
		return nil, fmt.Errorf("decompressing key %q in Secret %s/%s: %w", ref.Key, ref.Namespace, ref.Name, err)
	}
	if len(decompressed) > maxDecompressedSize {
		return nil, fmt.Errorf("decompressed data for key %q in Secret %s/%s exceeds maximum size (%d bytes)", ref.Key, ref.Namespace, ref.Name, maxDecompressedSize)
	}
	data = decompressed
}
obj := &unstructured.Unstructured{}
if err := json.Unmarshal(data, &obj.Object); err != nil {
	return nil, fmt.Errorf("unmarshaling object from key %q in Secret %s/%s: %w", ref.Key, ref.Namespace, ref.Name, err)
```
The e2e resolveObjectRef helper passes ref.Namespace directly to kubectl -n. Since ObjectSourceRef.namespace is optional and should default to the OLM system namespace when empty, this helper will fail for valid refs that omit namespace. Consider defaulting an empty ref.Namespace to the test’s OLM system namespace before calling kubectl.
Suggested change:

```go
ns := ref.Namespace
if ns == "" {
	// Default to the OLM system namespace when the ref omits namespace.
	// The test harness can set this via the OLM_NAMESPACE environment variable.
	if envNS := os.Getenv("OLM_NAMESPACE"); envNS != "" {
		ns = envNS
	}
}
out, err := k8sClient("get", "secret", ref.Name, "-n", ns, "-o", "json")
if err != nil {
	return nil, fmt.Errorf("getting Secret %s/%s: %w", ns, ref.Name, err)
}
var secret corev1.Secret
if err := json.Unmarshal([]byte(out), &secret); err != nil {
	return nil, fmt.Errorf("unmarshaling Secret %s/%s: %w", ns, ref.Name, err)
}
data, ok := secret.Data[ref.Key]
if !ok {
	return nil, fmt.Errorf("key %q not found in Secret %s/%s", ref.Key, ns, ref.Name)
}
// Auto-detect gzip compression (magic bytes 0x1f 0x8b)
if len(data) >= 2 && data[0] == 0x1f && data[1] == 0x8b {
	reader, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, fmt.Errorf("creating gzip reader for key %q in Secret %s/%s: %w", ref.Key, ns, ref.Name, err)
	}
	defer reader.Close()
	const maxDecompressedSize = 10 * 1024 * 1024 // 10 MiB
	limited := io.LimitReader(reader, maxDecompressedSize+1)
	decompressed, err := io.ReadAll(limited)
	if err != nil {
		return nil, fmt.Errorf("decompressing key %q in Secret %s/%s: %w", ref.Key, ns, ref.Name, err)
	}
	if len(decompressed) > maxDecompressedSize {
		return nil, fmt.Errorf("decompressed data for key %q in Secret %s/%s exceeds maximum size (%d bytes)", ref.Key, ns, ref.Name, maxDecompressedSize)
	}
	data = decompressed
}
obj := &unstructured.Unstructured{}
if err := json.Unmarshal(data, &obj.Object); err != nil {
	return nil, fmt.Errorf("unmarshaling object from key %q in Secret %s/%s: %w", ref.Key, ns, ref.Name, err)
```
Add support for storing ClusterExtensionRevision phase objects in content-addressable immutable Secrets instead of inline in the CER spec. This removes the etcd object size limit as a constraint on bundle size.

API changes:
- Add ObjectSourceRef type with name, namespace, and key fields
- Make ClusterExtensionRevisionObject.Object optional (omitzero)
- Add optional Ref field with XValidation ensuring exactly one is set
- Add RefResolutionFailed condition reason
- Add RevisionNameKey label for ref Secret association

Applier (boxcutter.go):
- Add SecretPacker to bin-pack serialized objects into Secrets with gzip compression for objects exceeding 800KiB
- Add createExternalizedRevision with crash-safe three-step sequence: create Secrets, create CER with refs, patch ownerReferences
- Externalize desiredRevision before SSA comparison so the patch compares refs-vs-refs instead of inline-vs-refs
- Add ensureSecretOwnerReferences for crash recovery
- Pass SystemNamespace to Boxcutter from main.go

CER controller:
- Add resolveObjectRef to fetch and decompress objects from Secrets
- Handle ref resolution in buildBoxcutterPhases
- Add RBAC for Secret get/list/watch

E2e tests:
- Add scenario verifying refs, immutability, labels, and ownerRefs
- Add step definitions for ref Secret validation
- Fix listExtensionRevisionResources and ClusterExtensionRevisionObjectsNotFoundOrNotOwned to resolve refs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix duplicate key size inflation in SecretPacker by only incrementing size for new content hash keys
- Add io.LimitReader (10 MiB cap) for gzip decompression to prevent gzip bombs in controller and e2e helpers
- Add doc comment clarifying ObjectSourceRef.Namespace defaults to OLM system namespace during ref resolution
- Fix docs: orphan cleanup uses ownerReference GC, ref resolution failures are retried (not terminal)
- Remove unused ClusterExtensionRevisionReasonRefResolutionFailed constant
- Add default error branch in e2e listExtensionRevisionResources for objects missing both ref and inline content

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
0db238a to 782de94
```go
	//
	// +optional
	// +kubebuilder:validation:MaxLength=63
	Namespace string `json:"namespace,omitempty"`
```
From the CER as its own thing perspective, should we make namespace required and drop the relationship with OLM?
/approve Some things we may want to revisit in a follow-up:
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: perdasilva

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Details

Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
/lgtm

I think it is OK for now, and we can iterate and improve in follow-ups.
da4f73c into operator-framework:main
Description
ClusterExtensionRevision (CER) objects embed full Kubernetes manifests inline, which hits etcd's 1.5 MiB object size limit for large bundles.
This PR externalizes phase objects into immutable Secrets. Each object is JSON-serialized as a separate key in the Secret, optionally gzip-compressed when larger than 900 KiB. Keys are content-addressable (SHA-256) so identical objects are deduplicated. The applier creates Secrets first, then the CER with refs, then patches ownerReferences — a crash-safe three-step sequence. The CER controller resolves refs back to objects at apply time, capping decompression at 10 MiB.
API changes add `ObjectSourceRef` as an alternative to the inline `Object` field, with CEL validation ensuring exactly one is set.

Design doc: `docs/concepts/large-bundle-support.md` (included in this PR)

E2e tests:
Reviewer Checklist