Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Jenkins Scripts for Testbed Migration #6751

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

XinShuYang
Copy link
Contributor

  • Upgraded Windows images to restore CI on the latest kubernetes cluster
  • Updated external node names for testbed migration.

@XinShuYang
Copy link
Contributor Author

/test-windows-all

@XinShuYang XinShuYang force-pushed the migrationimage branch 3 times, most recently from 3ca1e4d to 3bf64fd Compare October 17, 2024 12:10
@XinShuYang
Copy link
Contributor Author

/test-windows-e2e

antrea_images=("e2eteam/agnhost:2.13" "us.gcr.io/k8s-artifacts-prod/e2e-test-images/agnhost:2.13" "e2eteam/jessie-dnsutils:1.0" "e2eteam/pause:3.2")
k8s_images=("registry.k8s.io/e2e-test-images/agnhost:2.52" "registry.k8s.io/e2e-test-images/jessie-dnsutils:1.5" "registry.k8s.io/e2e-test-images/nginx:1.14-2" "registry.k8s.io/pause:3.8" "registry.k8s.io/pause:3.10")
conformance_images=("k8sprow.azurecr.io/kubernetes-e2e-test-images/agnhost:2.52" "k8sprow.azurecr.io/kubernetes-e2e-test-images/jessie-dnsutils:1.5" "k8sprow.azurecr.io/kubernetes-e2e-test-images/nginx:1.14-2" "k8sprow.azurecr.io/kubernetes-e2e-test-images/pause:3.8" "registry.k8s.io/e2e-test-images/pause:3.10")
e2e_images=("${DOCKER_REGISTRY}/antrea/toolbox:1.3-0" "registry.k8s.io/e2e-test-images/agnhost:2.40")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest version of toolbox is toolbox:1.4-0, you can update it.

conformance_images=("k8sprow.azurecr.io/kubernetes-e2e-test-images/agnhost:2.45" "k8sprow.azurecr.io/kubernetes-e2e-test-images/jessie-dnsutils:1.5" "k8sprow.azurecr.io/kubernetes-e2e-test-images/nginx:1.14-2" "k8sprow.azurecr.io/kubernetes-e2e-test-images/pause:3.8")
e2e_images=("toolbox:1.3-0")
antrea_images=("e2eteam/agnhost:2.13" "us.gcr.io/k8s-artifacts-prod/e2e-test-images/agnhost:2.13" "e2eteam/jessie-dnsutils:1.0" "e2eteam/pause:3.2")
k8s_images=("registry.k8s.io/e2e-test-images/agnhost:2.52" "registry.k8s.io/e2e-test-images/jessie-dnsutils:1.5" "registry.k8s.io/e2e-test-images/nginx:1.14-2" "registry.k8s.io/pause:3.8" "registry.k8s.io/pause:3.10")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are two pause images needed? same questions for the agnhost image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The agnhost image used by the Antrea e2e test is version 2.40, while the one used by the latest Kubernetes conformance test is 2.52. As for the pause image, I think we can use the latest version for testing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose we can update the e2e test codes to use the latest agnhost:2.52 as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either works for me, but I would prefer updating it in a separate PR, as I haven’t tested the Windows e2e with this image before the testbed was shut down. Also, we need to update related code in all CI scripts, but the goal of this PR is just for the migration.

@XinShuYang
Copy link
Contributor Author

/test-vm-e2e

Comment on lines 153 to 155
kubectl get nodes --selector=kubernetes.io/os=linux --no-headers=true -o custom-columns=IP:.status.addresses[0].address | while read -r IP; do
rsync -avr --progress --inplace -e "ssh -o StrictHostKeyChecking=no" ${WORKDIR}/*.yml jenkins@${IP}:${WORKDIR}/
done
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to copy the manifest to all Nodes?

Copy link
Contributor Author

@XinShuYang XinShuYang Oct 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is only used by the control-plane node. I have updated this part, thanks for the suggestion.

Comment on lines 146 to 148
kubectl get nodes --selector=kubernetes.io/os=linux --no-headers=true -o custom-columns=IP:.status.addresses[0].address | while read -r IP; do
rsync -avr --progress --inplace -e "ssh -o StrictHostKeyChecking=no" $TEMP_ANTREA_TAR jenkins@${IP}:${WORKDIR}/$TEMP_ANTREA_TAR
ssh -o StrictHostKeyChecking=no -n jenkins@${IP} "crictl rmi --prune; crictl ps --state Exited; ctr -n=k8s.io images import ${WORKDIR}/$TEMP_ANTREA_TAR" || true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out of curiosity, why didn't we have that step (loading the image to the Nodes) before?

Copy link
Contributor Author

@XinShuYang XinShuYang Oct 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, both Docker and containerd were installed on the same node. To avoid potential compatibility issues after kubernetes removed docker runtime, we have implemented the new topology for all jenkins testbeds: one node (the jumper node) has a Jenkins agent,Docker and kubeconfig installed, while another node has containerd and Kubernetes installed. When running CI, we need to copy the image from the jumper node to the Kubernetes node.

@@ -525,7 +525,7 @@ func testANPOnVMs(t *testing.T, data *TestData, vmList []vmInfo, osType string)
})
// Test FQDN rules in ANP
t.Run("testANPOnExternalNodeWithFQDN", func(t *testing.T) {
testANPWithFQDN(t, data, "anp-vmagent-fqdn", namespace, *appliedToVM, []string{"www.facebook.com"}, []string{"docs.google.com"}, []string{"github.com"})
testANPWithFQDN(t, data, "anp-vmagent-fqdn", namespace, *appliedToVM, []string{"docs.amazon.com"}, []string{"docs.google.com"}, []string{"github.com"})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be done in a separate PR? I don't feel very strongly about it, so if you want to keep it in this PR, it is fine by me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just to ensure that the VM e2e test can pass. I found that Facebook's URL is blocked in our new infrastructure...

* Upgraded Windows images to restore CI on the latest kubernetes cluster
* Updated external node e2e test and jenkins script for testbed migration.

Signed-off-by: Shuyang Xin <[email protected]>
ctr -n k8s.io image import $TEMP_ANTREA_TAR
IP=$(kubectl get nodes --selector=node-role.kubernetes.io/control-plane= --no-headers=true -o custom-columns=IP:.status.addresses[0].address)
rsync -avr --progress --inplace -e "ssh -o StrictHostKeyChecking=no" $TEMP_ANTREA_TAR jenkins@${IP}:${WORKDIR}/$TEMP_ANTREA_TAR
ssh -o StrictHostKeyChecking=no -n jenkins@${IP} "crictl rmi --prune; crictl ps --state Exited; ctr -n=k8s.io images import ${WORKDIR}/$TEMP_ANTREA_TAR" || true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we only copy / import the image on the control plane Node, which I think is consistent with what we were doing before.
But in this case, how does the image become available to the other (worker) Nodes? Is there a different step where worker Nodes are added and we load the required images?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants