Skip to content

Conversation

@sadasu
Copy link
Contributor

@sadasu sadasu commented Oct 29, 2025

1 worker node is allowed only when SNO is configured on platforms None or External. Add validation disallowing 1 worker node for other scenarios.

@openshift-ci-robot openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Oct 29, 2025
@openshift-ci-robot
Copy link
Contributor

@sadasu: This pull request references Jira Issue OCPBUGS-56954, which is invalid:

  • expected the bug to target the "4.21.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

1 worker node is allowed only when SNO is configured on platforms None or External. Add validation disallowing 1 worker node for other scenarios.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested review from jhixson74 and rna-afk October 29, 2025 18:36
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 29, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign sadasu for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sadasu
Copy link
Contributor Author

sadasu commented Oct 29, 2025

/jira refresh

@openshift-ci-robot openshift-ci-robot added the jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. label Oct 29, 2025
@openshift-ci-robot
Copy link
Contributor

@sadasu: This pull request references Jira Issue OCPBUGS-56954, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.21.0) matches configured target version for branch (4.21.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @gpei

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot removed the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label Oct 29, 2025
@openshift-ci openshift-ci bot requested a review from gpei October 29, 2025 18:41
@sadasu sadasu force-pushed the disallow-1-worker branch 3 times, most recently from 6ecd1b4 to 569a247 Compare October 30, 2025 14:59
1 worker node is allowed only when SNO is configured on platforms
None or External. Add validation disallowing 1 worker node for
other scenarios.
@sadasu sadasu force-pushed the disallow-1-worker branch from 569a247 to 9dbd36b Compare October 30, 2025 15:22
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 30, 2025

@sadasu: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/artifacts-images 9dbd36b link true /test artifacts-images
ci/prow/images 9dbd36b link true /test images
ci/prow/e2e-aws-ovn 9dbd36b link true /test e2e-aws-ovn
ci/prow/okd-scos-e2e-aws-ovn 9dbd36b link false /test okd-scos-e2e-aws-ovn

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@sadasu
Copy link
Contributor Author

sadasu commented Oct 31, 2025

/payload ?

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 31, 2025

@sadasu: it appears that you have attempted to use some version of the payload command, but your comment was incorrectly formatted and cannot be acted upon. See the docs for usage info.


// Validate that exactly 1 worker node is allowed only when a single control plane node is provisioned on platforms None or External
if computeReplicas == 1 && controlReplicas != 1 && !(platform.None != nil || platform.External != nil) {
allErrs = append(allErrs, field.Invalid(fldPath, computeReplicas, "exactly 1 worker node is not allowed for this platform and configuration. Use 0 workers for single-node clusters or 2 workers for multi-node clusters"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we say

Suggested change
allErrs = append(allErrs, field.Invalid(fldPath, computeReplicas, "exactly 1 worker node is not allowed for this platform and configuration. Use 0 workers for single-node clusters or 2 workers for multi-node clusters"))
allErrs = append(allErrs, field.Invalid(fldPath, computeReplicas, "exactly 1 worker node is not allowed for this platform and configuration. Use 0 workers for single-node clusters or at least 2 workers for multi-node clusters"))

in this error msg?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 as users might mistake it to exactly 2. We should probably use the correct field path:

Suggested change
allErrs = append(allErrs, field.Invalid(fldPath, computeReplicas, "exactly 1 worker node is not allowed for this platform and configuration. Use 0 workers for single-node clusters or 2 workers for multi-node clusters"))
allErrs = append(allErrs, field.Invalid(fldPath.Child("replicas"), computeReplicas, "exactly 1 worker node is not allowed for this platform and configuration. Use 0 workers for single-node clusters or at least 2 workers for multi-node clusters"))

@gpei
Copy link
Contributor

gpei commented Nov 3, 2025

Performed pre-merge testing with this PR, and the installer correctly handles the various scenarios where 1 worker is set in the install-config.

  1. Installer blocks 1 worker + 3 control plane on AWS with error message
ERROR failed to fetch Master Machines: failed to load asset "Install Config": failed to create install config: invalid "install-config.yaml" file: compute: Invalid value: 1: exactly 1 worker node is not allowed for this platform and configuration. Use 0 workers for single-node clusters or 2 workers for multi-node clusters 
  1. Installer allows 1 worker + 1 control plane on Azure
  2. Installer allows compact cluster, with 0 worker + 3 control plane
  3. Installer allows SNO cluster, with 0 worker + 1 control plane
  4. Installer allows 1 worker + 3 control plane on none/external platform


// Validate that exactly 1 worker node is allowed only when a single control plane node is provisioned on platforms None or External
if computeReplicas == 1 && controlReplicas != 1 && !(platform.None != nil || platform.External != nil) {
allErrs = append(allErrs, field.Invalid(fldPath, computeReplicas, "exactly 1 worker node is not allowed for this platform and configuration. Use 0 workers for single-node clusters or 2 workers for multi-node clusters"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 as users might mistake it to exactly 2. We should probably use the correct field path:

Suggested change
allErrs = append(allErrs, field.Invalid(fldPath, computeReplicas, "exactly 1 worker node is not allowed for this platform and configuration. Use 0 workers for single-node clusters or 2 workers for multi-node clusters"))
allErrs = append(allErrs, field.Invalid(fldPath.Child("replicas"), computeReplicas, "exactly 1 worker node is not allowed for this platform and configuration. Use 0 workers for single-node clusters or at least 2 workers for multi-node clusters"))

Comment on lines +835 to +836
// Validate that exactly 1 worker node is allowed only when a single control plane node is provisioned on platforms None or External
if computeReplicas == 1 && controlReplicas != 1 && !(platform.None != nil || platform.External != nil) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Validate that exactly 1 worker node is allowed only when a single control plane node is provisioned on platforms None or External

Regarding the comment, it hints that only SNO and platform none/external is allowed. But from #10042 (comment) and official doc, either SNO or none/external platform is allowed, which also matches the current code implementation.

I guess the comment is incorrect?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Validate that exactly 1 worker node is allowed only when a single control plane node is provisioned on platforms None or External
if computeReplicas == 1 && controlReplicas != 1 && !(platform.None != nil || platform.External != nil) {
// Validate that exactly 1 worker node is allowed only in either the following scenarios:
// 1. A single control plane node (SNO) is provisioned.
// 2. Platforms are None or External.
if computeReplicas == 1 {
isSNO := controlReplicas == 1
isAllowedPlatform := platform.None != nil || platform.External != nil
if !isSNO && !isAllowedPlatform {

nit: It seems a bit more verbose, but improves readability? WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1+1 is actually not a valid topology on day 1. Assisted is not capable of creating an SNO with worker nodes.
But 3+1 is actually a valid topology, the bug report notwithstanding.

{
name: "one compute replica, one control plane, AWS",
installConfig: func() *types.InstallConfig {
c := validInstallConfig()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
c := validInstallConfig()
c := validInstallConfig()
c.ControlPlane.Replicas = ptr.Int64(1)

The default now is 3 right? As in line 33-37. So we need to set it 1, which makes this case valid (i.e. expectedError to be nil)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants