
e2e: ceph-csi-operator deployment support #4947

Draft · wants to merge 5 commits into base: devel
Conversation

@iPraveenParihar (Contributor) commented Nov 7, 2024

Describe what this PR does

This PR supports running e2e tests against a ceph-csi deployment via ceph-csi-operator.

This includes:

  • scripts/deploy-ceph-csi-operator.sh to deploy ceph-csi via operator.
  • e2e/operator.go includes util methods for ceph-csi-operator.
  • changes to handle delete/update of ceph-csi pods.
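
A minimal sketch of how the toggle between the two deployment flavors could be driven from shell; only scripts/deploy-ceph-csi-operator.sh is from this PR, while `OPERATOR_DEPLOYMENT`, `deploy_script`, and scripts/install-helm.sh are illustrative assumptions:

```shell
# Illustrative only: selecting the deployment path for an e2e run.
# OPERATOR_DEPLOYMENT, deploy_script, and scripts/install-helm.sh are
# assumed names; scripts/deploy-ceph-csi-operator.sh is the script this PR adds.
OPERATOR_DEPLOYMENT=${OPERATOR_DEPLOYMENT:-false}   # mirrors the flag's false default

deploy_script() {
  if [ "$OPERATOR_DEPLOYMENT" = "true" ]; then
    # deploy ceph-csi through ceph-csi-operator
    echo "scripts/deploy-ceph-csi-operator.sh"
  else
    # hypothetical non-operator (default) deployment path
    echo "scripts/install-helm.sh"
  fi
}
```

With `OPERATOR_DEPLOYMENT=true`, `deploy_script` selects the operator-based script; left unset, it falls back to the default path.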

Is there anything that requires special attention

Do you have any questions?

Is the change backward compatible?

Are there concerns around backward compatibility?

Provide any external context for the change, if any.

For example:

  • Kubernetes links that explain why the change is required
  • CSI spec related changes/catch-up that necessitate this patch
  • golang related practices that necessitate this change

Related issues

Mention any github issues relevant to this PR. Adding below line
will help to auto close the issue once the PR is merged.

Depends-on: #4931
Fixes: #4856

Future concerns

List items that are not part of the PR and do not impact its
functionality, but are work items that can be taken up subsequently.

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
  • Reviewed the developer guide on Submitting a Pull Request
  • Pending release notes updated with breaking and/or notable changes for the next major release.
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.

Show available bot commands

These commands are normally not required, but in case of issues, leave any of
the following bot commands in an otherwise empty comment in this PR:

  • /retest ci/centos/<job-name>: retest the <job-name> after unrelated
    failure (please report the failure too!)

@iPraveenParihar added the component/testing (Additional test cases or CI work) label on Nov 7, 2024
@iPraveenParihar self-assigned this on Nov 7, 2024
@iPraveenParihar force-pushed the e2e/ceph-csi-operator-deployment-support branch from 306a93b to 0013f7a on November 7, 2024 06:39
github-actions bot commented Dec 7, 2024

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.

github-actions bot commented Jan 8, 2025

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.

This pull request has been automatically closed due to inactivity. Please re-open if these changes are still required.

@nixpanic (Member)

/test ci/centos/mini-e2e-operator/k8s-1.31

@nixpanic (Member)

/test ci/centos/mini-e2e-operator/k8s-1.31

This failed to pull quay.io/cephcsi/ceph-csi-operator: (without a tag?) from the CI mirror. You will need to add the container image to images.txt and wait until tomorrow to try again.

@iPraveenParihar force-pushed the e2e/ceph-csi-operator-deployment-support branch from 0013f7a to 2e8be35 on February 13, 2025 09:54
@iPraveenParihar (Contributor, Author)

/test ci/centos/mini-e2e-operator/k8s-1.31

@iPraveenParihar (Contributor, Author)

/retest ci/centos/mini-e2e-operator/k8s-1.31

@iPraveenParihar force-pushed the e2e/ceph-csi-operator-deployment-support branch from 2e8be35 to 91eb086 on February 17, 2025 05:12
@iPraveenParihar (Contributor, Author)

/test ci/centos/mini-e2e-operator/k8s-1.31

@iPraveenParihar force-pushed the e2e/ceph-csi-operator-deployment-support branch from 91eb086 to a17041e on February 17, 2025 06:48
@iPraveenParihar (Contributor, Author)

/retest ci/centos/mini-e2e-operator/k8s-1.31

@iPraveenParihar force-pushed the e2e/ceph-csi-operator-deployment-support branch from a17041e to 0d9a18b on February 24, 2025 05:13
@iPraveenParihar (Contributor, Author)

/test ci/centos/mini-e2e-operator/k8s-1.31

@iPraveenParihar force-pushed the e2e/ceph-csi-operator-deployment-support branch 3 times, most recently from 95811d1 to a00f47f on February 24, 2025 09:54
@iPraveenParihar (Contributor, Author)

operator CI failed - logs

  • CephFS: the fuse recovery test failed because the mount-info volumeMount is not present (raised a PR on ceph-csi-operator)
  • NFS: failed, subvolume count 0 does not match the expected count 1 (🤔 it passed on my machine; probably the updated csi-config-map hadn't taken effect when the test ran.)

@iPraveenParihar (Contributor, Author)

/retest ci/centos/mini-e2e-operator/k8s-1.31

@iPraveenParihar (Contributor, Author)

failed CI logs:

  [FAILED] failed to test volumeGroupSnapshot: failed to create volume group snapshot: failed to get VolumeGroupSnapshot cephfs-3893-vgs: failed to get volumesnapshot: client rate limiter Wait returned an error: context deadline exceeded

@iPraveenParihar (Contributor, Author)

/retest ci/centos/mini-e2e-operator/k8s-1.31

@Madhu-1 (Collaborator) commented Feb 27, 2025

failed to test volumeGroupSnapshot

@iPraveenParihar it looks like the volumegroupsnapshot policy is not enabled in the Driver/OperatorConfig CR https://github.com/ceph/ceph-csi-operator/blob/6e1c39c697cb397119920bff6551d12ceeb1a514/api/v1alpha1/driver_types.go#L308

@iPraveenParihar (Contributor, Author)

@iPraveenParihar it looks like the volumegroupsnapshot policy is not enabled in the Driver/OperatorConfig CR https://github.com/ceph/ceph-csi-operator/blob/6e1c39c697cb397119920bff6551d12ceeb1a514/api/v1alpha1/driver_types.go#L308

Yes, thanks, I noticed it 😀. Testing on my local machine.

@iPraveenParihar force-pushed the e2e/ceph-csi-operator-deployment-support branch from 62b0cda to 78e933e on February 27, 2025 08:54
@iPraveenParihar (Contributor, Author)

/test ci/centos/mini-e2e-operator/k8s-1.31

@iPraveenParihar (Contributor, Author)

I0227 09:35:32.750840       1 utils.go:266] ID: 22 Req-ID: pvc-844f6615-fdee-4b13-b063-2e563b69f91c GRPC call: /csi.v1.Controller/CreateVolume
I0227 09:35:32.751600       1 utils.go:267] ID: 22 Req-ID: pvc-844f6615-fdee-4b13-b063-2e563b69f91c GRPC request: {"capacity_range":{"required_bytes":1073741824},"name":"pvc-844f6615-fdee-4b13-b063-2e563b69f91c","parameters":{"clusterID":"e809629f-2772-4bac-8592-4575ebe7f2c7","fsName":"myfs","nfsCluster":"my-nfs","server":"rook-ceph-nfs-my-nfs-a.rook-ceph.svc.cluster.local","volumeNamePrefix":"nfs-export-"},"secrets":"***stripped***","volume_capabilities":[{"access_mode":{"mode":"SINGLE_NODE_SINGLE_WRITER"},"mount":{"fs_type":"ext4"}}]}
W0227 09:35:32.751628       1 credentials.go:119] adminID and adminKey are deprecated, please use userID and userKey instead
I0227 09:35:32.756124       1 omap.go:89] ID: 22 Req-ID: pvc-844f6615-fdee-4b13-b063-2e563b69f91c got omap values: (pool="myfs-metadata", namespace="csi", name="csi.volumes.default"): map[]
W0227 09:35:32.756146       1 credentials.go:119] adminID and adminKey are deprecated, please use userID and userKey instead
I0227 09:35:32.763523       1 omap.go:159] ID: 22 Req-ID: pvc-844f6615-fdee-4b13-b063-2e563b69f91c set omap keys (pool="myfs-metadata", namespace="csi", name="csi.volumes.default"): map[csi.volume.pvc-844f6615-fdee-4b13-b063-2e563b69f91c:de017c04-4b39-4197-835a-1a0ca60974a6])
I0227 09:35:32.767254       1 omap.go:159] ID: 22 Req-ID: pvc-844f6615-fdee-4b13-b063-2e563b69f91c set omap keys (pool="myfs-metadata", namespace="csi", name="csi.volume.de017c04-4b39-4197-835a-1a0ca60974a6"): map[csi.imagename:nfs-export-de017c04-4b39-4197-835a-1a0ca60974a6 csi.volname:pvc-844f6615-fdee-4b13-b063-2e563b69f91c])
I0227 09:35:32.767276       1 fsjournal.go:318] ID: 22 Req-ID: pvc-844f6615-fdee-4b13-b063-2e563b69f91c Generated Volume ID (0001-0024-e809629f-2772-4bac-8592-4575ebe7f2c7-0000000000000001-de017c04-4b39-4197-835a-1a0ca60974a6) and subvolume name (nfs-export-de017c04-4b39-4197-835a-1a0ca60974a6) for request name (pvc-844f6615-fdee-4b13-b063-2e563b69f91c)
I0227 09:35:32.822304       1 controllerserver.go:475] ID: 22 Req-ID: pvc-844f6615-fdee-4b13-b063-2e563b69f91c cephfs: successfully created backing volume named nfs-export-de017c04-4b39-4197-835a-1a0ca60974a6 for request name pvc-844f6615-fdee-4b13-b063-2e563b69f91c
I0227 09:35:32.822355       1 controllerserver.go:89] ID: 22 Req-ID: pvc-844f6615-fdee-4b13-b063-2e563b69f91c CephFS volume created: 0001-0024-e809629f-2772-4bac-8592-4575ebe7f2c7-0000000000000001-de017c04-4b39-4197-835a-1a0ca60974a6
W0227 09:35:32.822364       1 credentials.go:119] adminID and adminKey are deprecated, please use userID and userKey instead
I0227 09:35:32.827879       1 omap.go:159] ID: 22 Req-ID: pvc-844f6615-fdee-4b13-b063-2e563b69f91c set omap keys (pool="myfs-metadata", namespace="csi", name="csi.volume.de017c04-4b39-4197-835a-1a0ca60974a6"): map[csi.nfs.cluster:my-nfs])
I0227 09:35:32.946820       1 controllerserver.go:116] ID: 22 Req-ID: pvc-844f6615-fdee-4b13-b063-2e563b69f91c published NFS-export: 0001-0024-e809629f-2772-4bac-8592-4575ebe7f2c7-0000000000000001-de017c04-4b39-4197-835a-1a0ca60974a6
I0227 09:35:32.946924       1 utils.go:273] ID: 22 Req-ID: pvc-844f6615-fdee-4b13-b063-2e563b69f91c GRPC response: {"volume":{"capacity_bytes":1073741824,"volume_context":{"backingSnapshot":"false","clusterID":"e809629f-2772-4bac-8592-4575ebe7f2c7","fsName":"myfs","nfsCluster":"my-nfs","server":"rook-ceph-nfs-my-nfs-a.rook-ceph.svc.cluster.local","share":"/0001-0024-e809629f-2772-4bac-8592-4575ebe7f2c7-0000000000000001-de017c04-4b39-4197-835a-1a0ca60974a6","subvolumeName":"nfs-export-de017c04-4b39-4197-835a-1a0ca60974a6","subvolumePath":"/volumes/e2e/nfs-export-de017c04-4b39-4197-835a-1a0ca60974a6/c018210b-da7d-4458-a0ba-91eb9fc94b9a","volumeNamePrefix":"nfs-export-"},"volume_id":"0001-0024-e809629f-2772-4bac-8592-4575ebe7f2c7-0000000000000001-de017c04-4b39-4197-835a-1a0ca60974a6"}}

NFS test failed:

The PVC pvc-844f6615-fdee-4b13-b063-2e563b69f91c was created in the e2e subvolumegroup, but it was expected to be in csi.

I0227 09:35:32.946924       1 utils.go:273] ID: 22 Req-ID: pvc-844f6615-fdee-4b13-b063-2e563b69f91c GRPC response: {"volume":{"capacity_bytes":1073741824,"volume_context":{"backingSnapshot":"false","clusterID":"e809629f-2772-4bac-8592-4575ebe7f2c7","fsName":"myfs","nfsCluster":"my-nfs","server":"rook-ceph-nfs-my-nfs-a.rook-ceph.svc.cluster.local","share":"/0001-0024-e809629f-2772-4bac-8592-4575ebe7f2c7-0000000000000001-de017c04-4b39-4197-835a-1a0ca60974a6","subvolumeName":"nfs-export-de017c04-4b39-4197-835a-1a0ca60974a6","subvolumePath":"/volumes/e2e/nfs-export-de017c04-4b39-4197-835a-1a0ca60974a6/c018210b-da7d-4458-a0ba-91eb9fc94b9a","volumeNamePrefix":"nfs-export-"},"volume_id":"0001-0024-e809629f-2772-4bac-8592-4575ebe7f2c7-0000000000000001-de017c04-4b39-4197-835a-1a0ca60974a6"}}

In the NFS BeforeEach call we create a ConfigMap with the csi subvolumegroup. Probably there was a significant delay before the updated ceph-csi-config ConfigMap became available in the csi pods.

A similar thing happened for the CephFS test "verify rados objects are within a namespace":

I0227 09:33:27.905137       1 omap.go:159] ID: 257 Req-ID: pvc-11fc74aa-842b-47ab-b73f-95d461cce56a set omap keys (pool="myfs-metadata", namespace="csi", name="csi.volumes.default"): map[csi.volume.pvc-11fc74aa-842b-47ab-b73f-95d461cce56a:64a5c26a-52de-48ed-8511-7493db12421d])
I0227 09:33:27.909233       1 omap.go:159] ID: 257 Req-ID: pvc-11fc74aa-842b-47ab-b73f-95d461cce56a set omap keys (pool="myfs-metadata", namespace="csi", name="csi.volume.64a5c26a-52de-48ed-8511-7493db12421d"): map[csi.imagename:csi-vol-64a5c26a-52de-48ed-8511-7493db12421d csi.volname:pvc-11fc74aa-842b-47ab-b73f-95d461cce56a csi.volume.owner:cephfs-6326])
I0227 09:33:27.909253       1 fsjournal.go:318
  [FAILED] failed to validate omap count for rados ls --pool=myfs-metadata --namespace=cephfs-ns | grep -v default | grep -v csi.volume.group. | grep -c ^csi.volume.: expected omap object count 2, got 0
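
The failed check pipes `rados ls` output through a pair of grep filters before counting. The same counting logic can be exercised offline by feeding it sample object names; in this sketch the csi.volume.* suffixes are made up, while the filter pipeline is taken from the failure message above:

```shell
# Count per-volume omap objects the way the e2e check does, but on canned
# input instead of live `rados ls --pool=myfs-metadata --namespace=cephfs-ns`
# output. The two fabricated csi.volume.* names stand in for real UUIDs.
printf '%s\n' \
  csi.volumes.default \
  csi.volume.group.1111 \
  csi.volume.aaaa \
  csi.volume.bbbb |
  grep -v default |
  grep -v csi.volume.group. |
  grep -c '^csi\.volume\.'
# prints 2: only the two plain csi.volume.* objects survive the filters
```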

I'll try scaling down the csi controller pods before updating/creating the ConfigMap, and then scale them back up.
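
The proposed workaround could look roughly like this; it is a sketch only, and the function name plus any namespace/deployment/manifest names passed to it are assumptions, not what the PR actually implements:

```shell
# Sketch of "scale down, update the ConfigMap, scale back up".
# All resource names passed as arguments are hypothetical.
restart_provisioner_with_config() {
  ns=$1; deploy=$2; cm_manifest=$3
  kubectl -n "$ns" scale deployment "$deploy" --replicas=0
  kubectl -n "$ns" apply -f "$cm_manifest"              # refresh ceph-csi-config
  kubectl -n "$ns" scale deployment "$deploy" --replicas=1
  kubectl -n "$ns" rollout status deployment "$deploy"  # wait for fresh pods
}
```

Called e.g. as `restart_provisioner_with_config cephcsi csi-cephfsplugin-provisioner ceph-csi-config.yaml`, the freshly started pods read the updated ConfigMap at startup, sidestepping the propagation delay.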

@iPraveenParihar force-pushed the e2e/ceph-csi-operator-deployment-support branch from 78e933e to a1d6e45 on February 27, 2025 12:43
@iPraveenParihar (Contributor, Author)

/test ci/centos/mini-e2e-operator/k8s-1.31

@iPraveenParihar force-pushed the e2e/ceph-csi-operator-deployment-support branch from a1d6e45 to 8bc6d5b on February 27, 2025 13:29
@iPraveenParihar (Contributor, Author)

/test ci/centos/mini-e2e-operator/k8s-1.31

This commit adds an `operatorDeployment` flag, which defaults to false.
Set it to true to run tests against a ceph-csi deployment managed by
the operator.

Signed-off-by: Praveen M <[email protected]>
@iPraveenParihar force-pushed the e2e/ceph-csi-operator-deployment-support branch from 8bc6d5b to 1bd6592 on February 27, 2025 14:52
@iPraveenParihar (Contributor, Author)

/test ci/centos/mini-e2e/k8s-1.31

@iPraveenParihar (Contributor, Author)

/test ci/centos/mini-e2e-helm/k8s-1.31

@iPraveenParihar (Contributor, Author)

/test ci/centos/mini-e2e-operator/k8s-1.32

This commit adds e2e/operator.go, containing utility
methods specific to the operator.

Signed-off-by: Praveen M <[email protected]>
Signed-off-by: Praveen M <[email protected]>
@iPraveenParihar (Contributor, Author)

operator deployment CI passed (logs).

helm CI test failed because a label selector was not provided (logs).

@iPraveenParihar force-pushed the e2e/ceph-csi-operator-deployment-support branch from 1bd6592 to 8e120c0 on February 28, 2025 01:16
@iPraveenParihar (Contributor, Author)

/test ci/centos/mini-e2e-operator/k8s-1.32

@iPraveenParihar (Contributor, Author)

/test ci/centos/mini-e2e-helm/k8s-1.31

@iPraveenParihar (Contributor, Author)

/test ci/centos/mini-e2e/k8s-1.31

@iPraveenParihar (Contributor, Author)

/retest ci/centos/mini-e2e-helm/k8s-1.31

@iPraveenParihar (Contributor, Author)

/test ci/centos/mini-e2e-operator/k8s-1.32

@iPraveenParihar (Contributor, Author)

/retest ci/centos/mini-e2e-helm/k8s-1.31

Labels: component/testing (Additional test cases or CI work)
Successfully merging this pull request may close these issues.

Run e2e with ceph-csi deployed via ceph-csi-operator
3 participants