
CASMPET-7117: iSCSI SBPS provision and DNS config fails when HSN is not configured/ CASMPET-7126: CFS play for k8s labelling fails when it is already applied #284

Merged · 1 commit merged into develop from feature/CASMPET-7117 on Aug 9, 2024

Conversation

@ravikanth-nalla-hpe (Contributor) commented Jul 10, 2024

Summary and Scope

  • CASMPET-7117: iSCSI SBPS: LIO provision and DNS records config fails when HSN is not configured

    • fixed iSCSI LIO provisioning to exclude the HSN portal config when the HSN network is not configured (a hedged sketch of the idea follows this list)
    • fixed DNS "SRV" and "A" record creation to skip HSN records when HSN is not configured
  • CASMPET-7126: iSCSI SBPS: k8s labelling fails when it is already applied

    • fixed to avoid applying the k8s label when it already exists
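
The HSN guard in the first fix boils down to checking for an HSN address before creating that portal or its DNS records. A minimal shell sketch of the idea, assuming an hsn0 interface name, an $IQN variable, and a targetcli portal path purely for illustration (the repo's actual role and variable names may differ):

    # Hedged sketch: "hsn0", "$IQN", and the portal path are assumptions, not this repo's code.
    HSN_IP=$(ip -4 -o addr show dev hsn0 2>/dev/null | awk '{print $4}' | cut -d/ -f1)

    if [[ -n "$HSN_IP" ]]; then
        # HSN is configured: create the HSN portal alongside NMN and CMN
        targetcli "/iscsi/${IQN}/tpg1/portals" create "${HSN_IP}"
    else
        # HSN is not configured: skip the HSN portal (and its SRV/A records) instead of failing
        echo "HSN not configured; creating NMN and CMN portals only"
    fi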

Issues and Related PRs

Original EPIC: https://jira-pro.it.hpe.com:8443/browse/CASMPET-6797

Testing

Tested on bare metal systems with multiple worker nodes

Tested on:

"starlord" and "Odin"

Test description:

  • verified LIO provisioning when the HSN network is not configured: the overall config does not fail; HSN is skipped and the remaining portals (NMN and CMN) are created.

  • verified LIO provisioning when the HSN network is configured: all three network portals (HSN, NMN, and CMN) are created.

  • verified DNS "SRV" and "A" record creation when the HSN network is not configured: "SRV" and "A" records are created for NMN without any failure.

  • verified DNS "SRV" and "A" record creation when the HSN network is configured: records are created for both HSN and NMN without any failure (a hedged verification sketch follows these bullets).

  • verified k8s node labelling both fresh and when the label is already applied; labelling succeeds without any failure in both cases.
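
For reference, the record checks above are the sort of thing dig can confirm; a hedged sketch, assuming hypothetical SBPS service names (the real record names may differ), with the domain derived at runtime rather than hard-coded:

    # Hedged sketch: "_sbps-hsn._tcp" / "_sbps-nmn._tcp" are placeholder service names.
    DOMAIN=$(hostname -d)                        # host's DNS domain, not a hard-coded value
    dig +short SRV "_sbps-hsn._tcp.${DOMAIN}"    # expect answers only when HSN is configured
    dig +short SRV "_sbps-nmn._tcp.${DOMAIN}"    # expected in both cases
    dig +short A   "sbps-nmn.${DOMAIN}"          # matching "A" record for the NMN target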

  • Were the install/upgrade-based validation checks/tests run (goss tests/install-validation doc)?

  • Were continuous integration tests run? If not, why?

  • Was upgrade tested? If not, why?

  • Was downgrade tested? If not, why?

  • Were new tests (or test issues/Jiras) created for this change?

Risks and Mitigations

None

Pull Request Checklist

  • [ ] Version number(s) incremented, if applicable
  • Copyrights updated
  • [ ] License file intact
  • Target branch correct
  • CHANGELOG.md updated
  • Testing is appropriate and complete, if applicable
  • HPC Product Announcement prepared, if applicable

@dborman-hpe

Previously I didn't notice all the hard-coded references to hpc.amslabs.hpecorp.net. Those all need to be fixed to dynamically determine the host's domain name and use that instead.
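
A minimal sketch of one way the scripts could derive it at runtime instead, assuming the node's DNS domain is the value wanted (the repo may actually pull it from another source, e.g. cloud-init or SLS data):

    # Hedged sketch: derive the domain dynamically rather than hard-coding hpc.amslabs.hpecorp.net.
    DOMAIN=$(hostname -d)
    if [[ -z "$DOMAIN" ]]; then
        # Fall back to the first search domain in resolv.conf if hostname -d is empty
        DOMAIN=$(awk '/^search/ {print $2; exit}' /etc/resolv.conf)
    fi
    echo "Using domain: ${DOMAIN}"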

@dborman-hpe

Please run the shell scripts through shellcheck and address the issues it finds (yes, I know some of the issues were there before, but it'd be good to clean them up)

@ravikanth-nalla-hpe
Contributor Author

> Please run the shell scripts through shellcheck and address the issues it finds (yes, I know some of the issues were there before, but it'd be good to clean them up)

I have addressed all the shellcheck warnings except the one below, from apply_k8s_label.sh:

Line 29:
if [[ ! $(kubectl get nodes -l $LABEL | grep "$HOST_NAME") ]]
        ^-- SC2143 (style): Use grep -q instead of comparing output with [ -n .. ].

It suggests using "-q" (quiet option), but that does not work since the command always returns "0" (in both cases, whether the label is present or not; it just won't print the output to stdout). Also, I'm not opting to use the "--overwrite" option just for the sake of avoiding this shellcheck warning; since Ansible is idempotent, I just don't want to overwrite the config.

@ravikanth-nalla-hpe ravikanth-nalla-hpe mentioned this pull request Aug 4, 2024
@dborman-hpe

dborman-hpe commented Aug 5, 2024

> I have addressed all the shellcheck warnings except the one below, from apply_k8s_label.sh [...] It suggests using "-q" (quiet option), but that does not work since the command always returns "0" (in both cases, whether the label is present or not; it just won't print the output to stdout).

It works fine for me:

ncn-m001:~ # kubectl get nodes -l $LABEL
NAME       STATUS   ROLES                  AGE   VERSION
ncn-m001   Ready    control-plane,master   13d   v1.22.13
ncn-w001   Ready    <none>                 13d   v1.22.13
ncn-w002   Ready    <none>                 13d   v1.22.13
ncn-w003   Ready    <none>                 13d   v1.22.13
ncn-w004   Ready    <none>                 13d   v1.22.13
ncn-m001:~ # kubectl get nodes -l $LABEL | grep -q ncn-w001
ncn-m001:~ # echo $?
0
ncn-m001:~ # kubectl get nodes -l $LABEL | grep -q ncn-m002
ncn-m001:~ # echo $?
1
ncn-m001:~ # 

You have to remove both $(...) and [[...]], so the change should be to simplify it to just:

if ! kubectl get nodes -l $LABEL | grep -q "$HOST_NAME"

@dborman-hpe

@ravikanth-nalla-hpe, this looks good. Fix the final shellcheck error and this is ready to be merged; please squash everything down to a single commit.

@dborman-hpe left a comment

LGTM, thanks!

@ravikanth-nalla-hpe
Contributor Author

@mharding-hpe Can you please merge this PR to master branch?

@@ -42,4 +42,4 @@
   # Configure SBPS DNS "SRV" and "A" records
   - role: csm.sbps.dns_srv_records
   # Mount s3 bucket 'boot-images' using s3fs read-only policy for SBPS agent
-  - role: csm.sbps.mount_s3_images
+  #- role: csm.sbps.mount_s3_images
Review comment (Contributor):

We don't want to check in commented lines. If this is no longer needed, we should just remove the line.

Suggested change
#- role: csm.sbps.mount_s3_images

Comment on lines 29 to 32
if ! kubectl get nodes -l $LABEL | grep -q "$HOST_NAME"
then
kubectl label nodes "$HOST_NAME" $LABEL
fi
Review comment (Contributor):

Suggested change
if ! kubectl get nodes -l $LABEL | grep -q "$HOST_NAME"
then
kubectl label nodes "$HOST_NAME" $LABEL
fi
# If the label is already applied to this node, then exit
kubectl get nodes -l "$LABEL" --no-headers | grep -Eq "^${HOST_NAME}[[:space:]]" && exit 0
# Otherwise, apply the label to the node
kubectl label nodes "$HOST_NAME" "$LABEL"

@mharding-hpe
Contributor

> @mharding-hpe Can you please merge this PR to master branch?

First we will need to merge it into develop, once we resolve the couple of small review comments. Then we can merge it to master.

@mharding-hpe
Contributor

You'll also need to rebase on top of the changes that have been made to develop

CASMPET-7126: CFS play for k8s labelling fails when it is already applied

   - fixed iSCSI SBPS provisioning to exclude HSN portal config when HSN is not configured
   - fixed to avoid DNS SRV and A records creation for HSN when HSN is not configured
   - fixed to avoid applying k8s node label when it is already applied
@mharding-hpe mharding-hpe merged commit 9369ba7 into develop Aug 9, 2024
4 of 5 checks passed
@mharding-hpe mharding-hpe deleted the feature/CASMPET-7117 branch August 9, 2024 16:03