Improve conditions reporting #1605
	return nil
}

slices.SortFunc(conditions, func(a, b common.Condition) int {
No sorting in SetCondition?
But with sorting, the guidance "When designing Conditions for a resource, it's helpful to have a common top-level condition which summarizes more detailed conditions. Simple consumers may simply query the top-level condition. Although they are not a consistent standard, the Ready and Succeeded condition types may be used by API designers for long-running and bounded-execution objects, respectively." is probably not applicable?
> No sorting in SetCondition?

The sort here is probably a leftover from earlier iterations; I don't think it makes any sense here (the sort done in SetCondition is probably not needed either).
	instance.Status.InstalledComponents = make(map[string]bool)
}

err := computeComponentsStatus(ctx, rr.Client, instance, cr.DefaultRegistry())
It can probably take the whole rr, with .Conditions to access them via the Manager.
indeed
	notReadyComponents = append(notReadyComponents, component.GetName())
	return err
}
It is probably possible to create a new Manager for ci's conditions and use it for access.
Yeah, I was also thinking of extending the manager so that it can extract conditions from other managers, giving a sort of hierarchy. But I didn't want to make it too complicated for a POC.
I'll take this into account while splitting this PR into smaller chunks.
if !ok {
	return fmt.Errorf("resource instance %v is not a componentApi.DataSciencePipelines", rr.Instance)
}
rr.Conditions.MarkTrue(status.ConditionArgoWorkflowAvailable)
Every such call modifies (and then re-modifies) the condition in the CR's status. Maybe some modification API on the Condition object could be created, followed by a single call to rr.Conditions.SetCondition()?
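The batching idea raised in this comment could look roughly like the sketch below. Everything in it is hypothetical and only illustrates the shape of such an API: `Status`, `Batch`, `NewBatch`, `Mark` and `Commit` are names made up for this example and are not part of the operator or of the PR.

```go
package main

import "fmt"

// Condition is a trimmed-down, hypothetical condition type.
type Condition struct {
	Type   string
	Status string
}

// Status is a hypothetical stand-in for a CR's status block.
type Status struct {
	Conditions []Condition
	writes     int // counts how often the slice is rewritten, for illustration only
}

// SetConditions replaces the whole slice in one step.
func (s *Status) SetConditions(c []Condition) {
	s.Conditions = c
	s.writes++
}

// Batch accumulates changes on a private copy and writes them back once.
type Batch struct {
	target  *Status
	pending []Condition
}

// NewBatch copies the current conditions so the target stays untouched until Commit.
func NewBatch(s *Status) *Batch {
	return &Batch{target: s, pending: append([]Condition(nil), s.Conditions...)}
}

// Mark updates the pending condition of the given type, or appends it if missing.
func (b *Batch) Mark(conditionType, status string) *Batch {
	for i := range b.pending {
		if b.pending[i].Type == conditionType {
			b.pending[i].Status = status
			return b
		}
	}
	b.pending = append(b.pending, Condition{Type: conditionType, Status: status})
	return b
}

// Commit writes all pending changes to the target with a single call.
func (b *Batch) Commit() { b.target.SetConditions(b.pending) }

func main() {
	s := &Status{}
	NewBatch(s).
		Mark("ArgoWorkflowAvailable", "True").
		Mark("DeploymentsAvailable", "False").
		Mark("ArgoWorkflowAvailable", "False").
		Commit()
	fmt.Println(len(s.Conditions), s.writes) // prints "2 1"
}
```

The trade-off is that intermediate states are never visible in the status, which may or may not be desirable if other code reads the conditions between calls.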
if newCondition.LastTransitionTime.IsZero() {
	newCondition.LastTransitionTime = metav1.NewTime(time.Now())
}
*conditions = append(*conditions, newCondition)
If we are modifying in place, maybe it makes sense to change ConditionsAccessor to return a pointer as well? Then fewer SetConditions calls would be needed. Otherwise it should probably take the slice (not a pointer) and return the modified one.
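To make the two alternatives in this comment concrete, here is a small sketch contrasting them. The function names `setInPlace` and `withCondition` and the simplified `Condition` type are assumptions made for illustration; they do not reflect the PR's actual signatures.

```go
package main

import "fmt"

// Condition is a trimmed-down, hypothetical condition type.
type Condition struct {
	Type   string
	Status string
}

// setInPlace mutates the caller's slice through a pointer, so no
// follow-up call is needed to store the result.
func setInPlace(conditions *[]Condition, c Condition) {
	for i := range *conditions {
		if (*conditions)[i].Type == c.Type {
			(*conditions)[i] = c
			return
		}
	}
	*conditions = append(*conditions, c)
}

// withCondition leaves its input untouched and returns an updated copy,
// which the caller has to store explicitly.
func withCondition(conditions []Condition, c Condition) []Condition {
	out := make([]Condition, 0, len(conditions)+1)
	replaced := false
	for _, existing := range conditions {
		if existing.Type == c.Type {
			out = append(out, c)
			replaced = true
		} else {
			out = append(out, existing)
		}
	}
	if !replaced {
		out = append(out, c)
	}
	return out
}

func main() {
	a := []Condition{{Type: "Ready", Status: "False"}}
	setInPlace(&a, Condition{Type: "Ready", Status: "True"})
	fmt.Println(a[0].Status) // prints "True"

	b := []Condition{{Type: "Ready", Status: "False"}}
	c := withCondition(b, Condition{Type: "Ready", Status: "True"})
	fmt.Println(b[0].Status, c[0].Status) // prints "False True"
}
```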
At one point, I did want to replace the explicit collection passing with a ConditionsAccessor instance to make things less verbose and confusing,
// eventually carrying it from an old implementation
newCondition.LastHeartbeatTime = nil

existingCondition := FindStatusCondition(*conditions, newCondition.Type)
It gives us a pointer to the condition itself, and then all the different fields are replaced one by one. Why not replace the whole structure?
I would probably even avoid the optimization and replace unconditionally. It would require splitting the Manager.SetCondition method into variants with and without happiness, to avoid infinite recursion.
Yeah, that code is a verbatim copy of the code shipped in the k8s.io/apimachinery/pkg/api/meta package; it must be refactored a little bit.
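A possible shape for the refactoring discussed in this thread, replacing the stored condition as a whole rather than field by field, is sketched below. It is only an illustration under assumptions: the `Condition` type is simplified, `setCondition` takes an explicit `now` argument to keep the example deterministic, and the choice to preserve `LastTransitionTime` when the status does not change is one reasonable policy, not necessarily what the PR ends up doing.

```go
package main

import (
	"fmt"
	"time"
)

// Condition is a simplified, hypothetical condition type.
type Condition struct {
	Type               string
	Status             string
	Reason             string
	Message            string
	LastTransitionTime time.Time
}

// setCondition stores newCondition as a whole instead of copying fields
// one by one. The transition time is carried over when the status is
// unchanged, and defaulted to now when it is unset on a real transition.
func setCondition(conditions *[]Condition, newCondition Condition, now time.Time) {
	for i := range *conditions {
		existing := &(*conditions)[i]
		if existing.Type != newCondition.Type {
			continue
		}
		if existing.Status == newCondition.Status {
			// No transition happened: keep the original timestamp.
			newCondition.LastTransitionTime = existing.LastTransitionTime
		} else if newCondition.LastTransitionTime.IsZero() {
			newCondition.LastTransitionTime = now
		}
		*existing = newCondition
		return
	}
	// The condition type is not present yet: append it.
	if newCondition.LastTransitionTime.IsZero() {
		newCondition.LastTransitionTime = now
	}
	*conditions = append(*conditions, newCondition)
}

func main() {
	t0 := time.Date(2025, 1, 1, 0, 0, 0, 0, time.UTC)
	t1 := t0.Add(time.Minute)

	var conds []Condition
	setCondition(&conds, Condition{Type: "Ready", Status: "False", Reason: "Init"}, t0)
	setCondition(&conds, Condition{Type: "Ready", Status: "False", Reason: "Retry"}, t1)
	fmt.Println(conds[0].Reason, conds[0].LastTransitionTime.Equal(t0)) // prints "Retry true"

	setCondition(&conds, Condition{Type: "Ready", Status: "True", Reason: "Done"}, t1)
	fmt.Println(conds[0].Reason, conds[0].LastTransitionTime.Equal(t1)) // prints "Done true"
}
```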
Description
Important
This PR is mainly intended to gather feedback on some work and decide how to move forward.
This work was intended to implement RHOAIENG-18216, that is, to make reconciliation errors visible in the status sub-resource of the various CRs as well; however, it ended up being a larger chunk of work. If we decide to move forward with this approach, the PR must be split into smaller, incremental PRs.
I did a little research on how conditions are reported by other operators and ended up taking a lot of inspiration from how the Knative operators handle conditions, so:

For every API that is implemented with the reconciler framework, we now have at least two conditions:
- Ready, which reports the overall status of the resource
- ProvisioningSucceeded, which reports any error that occurs during reconciliation

If a component has additional conditions, it can declare them (if not, all the conditions will be taken into account).
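As a rough illustration of how a top-level Ready condition can summarize its dependent conditions, here is a minimal sketch. It is not the PR's implementation: the `Condition` type and the `summarize` function are hypothetical, and the handling of the severity field (an empty value meaning Error, and Info not blocking readiness) follows the behaviour described further down in this description.

```go
package main

import "fmt"

// Condition is a simplified, hypothetical stand-in for the operator's
// condition type; only the fields needed for the sketch are included.
type Condition struct {
	Type     string
	Status   string // "True", "False" or "Unknown"
	Severity string // "" means Error (the default); "Info" is non-blocking
	Message  string
}

// summarize derives a top-level Ready condition from the dependent
// conditions: Ready is False as soon as one dependent is not True,
// unless that dependent is marked with Info severity.
func summarize(dependents []Condition) Condition {
	for _, c := range dependents {
		if c.Status == "True" || c.Severity == "Info" {
			continue
		}
		return Condition{
			Type:    "Ready",
			Status:  "False",
			Message: fmt.Sprintf("%s is not satisfied", c.Type),
		}
	}
	return Condition{Type: "Ready", Status: "True"}
}

func main() {
	blocking := []Condition{
		{Type: "ProvisioningSucceeded", Status: "True"},
		{Type: "DeploymentsAvailable", Status: "False"},
	}
	fmt.Println(summarize(blocking).Status) // prints "False"

	informational := []Condition{
		{Type: "ProvisioningSucceeded", Status: "True"},
		{Type: "ComponentsReady", Status: "False", Severity: "Info"},
	}
	fmt.Println(summarize(informational).Status) // prints "True"
}
```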
So as an example, in the ModelRegistry component I now have:

Which results in having its status populated with the following conditions:

And the Ready condition is reported as failing, because the DeploymentsAvailable condition is not satisfied.

The DataScienceCluster has also been refactored to use the reconciler framework and, as a consequence, behaves the same way: it has a top-level Ready condition, a ProvisioningSucceeded one, and a dedicated ComponentsReady condition that reports the overall status of the provisioned components (individual component conditions are still reported).

Note
In this case the Ready condition is reported as satisfied even though ComponentsReady is not. This is because the severity is marked as Info (the default is Error, which is represented by an empty value).

This severity field can be useful for reporting specific states. As an example, the Kserve component would report the Ready condition as true even if ServerlessAvailable and ServiceMeshAvailable are not (in this case because the component is explicitly configured not to use serverless).

Important
As part of this work, some other changes have been made:
- DataScienceReconciler to use the reconciler framework
- sgc action to offer more configurable options and also usable to remove non-managed components
- lastHeartBeatTime from DSC's conditions

How Has This Been Tested?
Screenshot or short clip
Merge criteria
- Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
- The developer has manually tested the changes and verified that the changes work