Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OTA-1224: status: simplify worker status line (2) #1918

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

hongkailiu
Copy link
Member

  • Show the number of available and outdated nodes (and thus the number of available and updated can be calculated) if workers' upgrading is started.
  • Show the number of unavailable and progressing nodes if there are unavailable nodes.

/cc @DavidHurta @petr-muller

pkg/cli/admin/upgrade/status/workerpool.go Outdated Show resolved Hide resolved
@@ -6,7 +6,7 @@ All control plane nodes successfully updated to 4.16.0-ec.3
= Worker Upgrade =

WORKER POOL ASSESSMENT COMPLETION STATUS
worker Degraded 37% (22/59) 44 Available, 5 Progressing, 12 Draining, 7 Degraded
worker Degraded 37% (22/59) 44 Available (23 Outdated), 15 Unavailable (Progressing 5), 5 Progressing, 12 Draining, 7 Degraded
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering whether in the X Unavailable (Progressing Y), Z Progressing output the Y and Z values will always be equal and thus will render one of the references redundant. I would suppose a progressing node is also always unavailable, is it not?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unavailable is from MCO.
A node is unavailable if

  • it has problems that need admins' attention. We had a card for the reason etc. My feeling is that an unavailable node in this case could be NOT progressing.
  • MCO is updating the node. An unavailable node in this case should be progressing as well.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm.. definitely a lot of dilemmas on how to UX this right. moreover its pretty visible so "better be done right" 😉
honestly, X Unavailable (Progressing Y), Z Progressing looks a bit strange to my eyes.
ideally, those should be made as obvious for the many as possible (in which case either displaying too much or too little is an enemy). that unfortunately is more of a finding a middle-ground between different personal taste, than solving an actual technical challenge.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tbh unavailable itself sounds a bit ambiguous, specifically because it can have either "the good" or "the bad" kind of unavailable. I wish my linguistical skill could provide a simple yet accurate word to name those two cases distinguishably..

Copy link
Contributor

@DavidHurta DavidHurta Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Brainstorming outputs. I am not suggesting or recommending any of the outputs.

worker        Degraded     37% (22/59)   44 Available (23 Outdated), 15 Unavailable (5 Progressing), 12 Draining, 7 Degraded

or

worker        Degraded     37% (22/59)   44 Available (23 Outdated), 15 Unavailable, 5 Progressing, 12 Draining, 7 Degraded

tbh unavailable itself sounds a bit ambiguous, specifically because it can have either "the good" or "the bad" kind of unavailable.

Yes. That is the goal. Not that simple, hehe.

One extreme is having something like "15 are unavailable (5 of them are due to the update; this is normal; the other ones can be problematic)". The other one is having users take the total from the completion column, subtract available, and assuming users know that a progressing node is unavailable, and subtract that from the unavailable. A little bit complex.

For example, my first output tries to tie the progressing to the unavailable (there is a connection between them; 15 are unavailable, but 5 of them are progressing (but maybe it's not that clear?) - the current PR change), and removes the second reference of the progressing (as it is probably redundant IMO).

worker        Degraded     37% (22/59)   44 Available (23 Outdated), 15 Unavailable (5 Progressing), 12 Draining, 7 Degraded

@DavidHurta
Copy link
Contributor

DavidHurta commented Nov 20, 2024

/retitle OTA-1224: status: simplify worker status line (2)

So that we can see the PR in the Jira issue. I find it helpful.

@openshift-ci openshift-ci bot changed the title status: simplify worker status line (2) OTA-1224: status: simplify worker status line (2) Nov 20, 2024
@openshift-ci-robot
Copy link

openshift-ci-robot commented Nov 20, 2024

@hongkailiu: This pull request references OTA-1224 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.18.0" version, but no target version was set.

In response to this:

  • Show the number of available and outdated nodes (and thus the number of available and updated can be calculated) if workers' upgrading is started.
  • Show the number of unavailable and progressing nodes if there are unavailable nodes.

/cc @DavidHurta @petr-muller

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Nov 20, 2024
Copy link
Contributor

openshift-ci bot commented Nov 20, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: hongkailiu
Once this PR has been reviewed and has the lgtm label, please ask for approval from davidhurta. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link
Contributor

openshift-ci bot commented Nov 21, 2024

@hongkailiu: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws-ovn-serial 80145ea link true /test e2e-aws-ovn-serial
ci/prow/e2e-agnostic-ovn-cmd 80145ea link true /test e2e-agnostic-ovn-cmd
ci/prow/okd-scos-e2e-aws-ovn 80145ea link false /test okd-scos-e2e-aws-ovn

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants