
[Flaky Test] CAPM3 deployment is not ready #1851

Closed
mboukhalfa opened this issue Jul 19, 2024 · 5 comments
Labels
kind/flake Categorizes issue or PR as related to a flaky test. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. triage/accepted Indicates an issue is ready to be actively worked on.

Comments

@mboukhalfa
Member

Which jobs are flaking?

Only:
Clusterctl upgrade main

Which tests are flaking?

metal3-periodic-e2e-clusterctl-upgrade-test-main: "When testing cluster upgrade from releases (v1.7=>current)", in STEP: [0] Upgrading providers to the latest version available

Since when has it been flaking?

First seen on Jul 10, 2024, 10:10:00 PM

Jenkins link

https://jenkins.nordix.org/view/Metal3%20Periodic/job/metal3-periodic-e2e-clusterctl-upgrade-test-main/77/consoleFull

Reason for failure (if possible)

Not sure; pasting the error logs below:

16:51:00    < Exit [AfterEach] When testing cluster upgrade from releases (v1.7=>current) [clusterctl-upgrade] @ 07/18/24 13:50:54.088 (5m1.138s)
16:51:00  • [FAILED] [3413.643 seconds]
16:51:00  When testing cluster upgrade from releases (v1.7=>current) [clusterctl-upgrade] [It] Should create a management cluster and then upgrade all the providers
16:51:00  /home/metal3ci/go/pkg/mod/sigs.k8s.io/cluster-api/[email protected]/e2e/clusterctl_upgrade.go:253
16:51:00  
16:51:00    [FAILED] failed to run clusterctl upgrade
16:51:00    Unexpected error:
16:51:00        <*errors.withStack | 0xc001bb3848>: 
16:51:00        deployment "capm3-controller-manager" is not ready after 5m0s: failed to connect to the management cluster: action failed after 0 attempts: context deadline exceeded
16:51:00        {
16:51:00            error: <*errors.withMessage | 0xc001fecde0>{
16:51:00                cause: <*errors.withStack | 0xc001bb3818>{
16:51:00                    error: <*errors.withMessage | 0xc001fecdc0>{
16:51:00                        cause: <*errors.withStack | 0xc001bb37e8>{
16:51:00                            error: <*errors.withMessage | 0xc001fecda0>{
16:51:00                                cause: <context.deadlineExceededError>{},
16:51:00                                msg: "action failed after 0 attempts",
16:51:00                            },
16:51:00                            stack: [0x3113ac5, 0x3132326, 0x31198b0, 0x24afe72, 0x24afccd, 0x24b0185, 0x31196f3, 0x3119673, 0x3119507, 0x314108b, 0x313ec45, 0x315aa8d, 0x326dc48, 0x3272d08, 0x34d627a, 0x192c193, 0x194036d, 0x148fe21],
16:51:00                        },
16:51:00                        msg: "failed to connect to the management cluster",
16:51:00                    },
16:51:00                    stack: [0x313233c, 0x31198b0, 0x24afe72, 0x24afccd, 0x24b0185, 0x31196f3, 0x3119673, 0x3119507, 0x314108b, 0x313ec45, 0x315aa8d, 0x326dc48, 0x3272d08, 0x34d627a, 0x192c193, 0x194036d, 0x148fe21],
16:51:00                },
16:51:00                msg: "deployment \"capm3-controller-manager\" is not ready after 5m0s",
16:51:00            },
16:51:00            stack: [0x31197e8, 0x3119507, 0x314108b, 0x313ec45, 0x315aa8d, 0x326dc48, 0x3272d08, 0x34d627a, 0x192c193, 0x194036d, 0x148fe21],
16:51:00        }
16:51:00    occurred
16:51:00    In [It] at: /home/metal3ci/go/pkg/mod/sigs.k8s.io/cluster-api/[email protected]/framework/clusterctl/client.go:202 @ 07/18/24 13:35:57.588
16:51:00  
16:51:00    Full Stack Trace
16:51:00      sigs.k8s.io/cluster-api/test/framework/clusterctl.Upgrade({_, _}, {{0xc000de2300, 0x53}, {0xc00063a35d, 0x47}, 0x0, {0xc001a7a300, 0x24}, {0xc00193b640, ...}, ...})
16:51:00      	/home/metal3ci/go/pkg/mod/sigs.k8s.io/cluster-api/[email protected]/framework/clusterctl/client.go:202 +0x63d
16:51:00      sigs.k8s.io/cluster-api/test/framework/clusterctl.UpgradeManagementClusterAndWait({_, _}, {{0x41f4178, 0xc000fb1560}, {0xc00063a35d, 0x47}, 0x0, {0x3d27c5b, 0x7}, {0x0, ...}, ...}, ...)
16:51:00      	/home/metal3ci/go/pkg/mod/sigs.k8s.io/cluster-api/[email protected]/framework/clusterctl/clusterctl_helpers.go:212 +0x9e8
16:51:00      sigs.k8s.io/cluster-api/test/e2e.ClusterctlUpgradeSpec.func2()
16:51:00      	/home/metal3ci/go/pkg/mod/sigs.k8s.io/cluster-api/[email protected]/e2e/clusterctl_upgrade.go:582 +0x38ba
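
For reference, the failing wait is the CAPI test framework polling the capm3-controller-manager Deployment for availability after upgrading the providers, and timing out after 5 minutes. Below is a minimal standalone sketch of an equivalent readiness check (this is not the framework's own code; the kubeconfig path and the capm3-system namespace are assumptions):

```go
// Minimal sketch of a readiness check equivalent to the wait that times out
// in the log above. Not the CAPI test framework's implementation; the
// kubeconfig path and namespace below are assumptions.
package main

import (
	"context"
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Assumed path to the management cluster kubeconfig.
	cfg, err := clientcmd.BuildConfigFromFlags("", "/path/to/management-cluster.kubeconfig")
	if err != nil {
		panic(err)
	}
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// Same 5-minute budget as the failing wait in the log.
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
	defer cancel()

	err = wait.PollUntilContextCancel(ctx, 10*time.Second, true, func(ctx context.Context) (bool, error) {
		d, err := cs.AppsV1().Deployments("capm3-system").Get(ctx, "capm3-controller-manager", metav1.GetOptions{})
		if err != nil {
			// Transient API errors (e.g. the apiserver not reachable yet)
			// are logged and retried rather than treated as fatal.
			fmt.Println("get deployment:", err)
			return false, nil
		}
		return d.Spec.Replicas != nil && d.Status.AvailableReplicas == *d.Spec.Replicas, nil
	})
	if err != nil {
		fmt.Println(`deployment "capm3-controller-manager" is not ready:`, err)
		return
	}
	fmt.Println("deployment is ready")
}
```

The "failed to connect to the management cluster: action failed after 0 attempts: context deadline exceeded" wrapping in the log suggests the timeout was consumed before the check could even reach the apiserver, which points more at cluster/infra reachability than at the Deployment itself.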

Anything else we need to know?

clusterctl was failing because of another issue that got fixed recently. This flake could also be caused by infrastructure problems; it is fairly frequent.

Label(s) to be applied

/kind flake

@metal3-io-bot metal3-io-bot added kind/flake Categorizes issue or PR as related to a flaky test. needs-triage Indicates an issue lacks a `triage/foo` label and requires one. labels Jul 19, 2024
@mboukhalfa
Member Author

triage/accepted

@mboukhalfa mboukhalfa added the triage/accepted Indicates an issue is ready to be actively worked on. label Jul 22, 2024
@metal3-io-bot metal3-io-bot removed the needs-triage Indicates an issue lacks a `triage/foo` label and requires one. label Jul 22, 2024
@metal3-io metal3-io deleted a comment from metal3-io-bot Jul 22, 2024
@metal3-io-bot
Contributor

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues will close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@metal3-io-bot metal3-io-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 20, 2024
@mquhuy
Member

mquhuy commented Oct 21, 2024

@mboukhalfa is this still an issue?

@mboukhalfa
Member Author

It seems not; the clusterctl upgrade has not failed for a long time.

@adilGhaffarDev
Member

We are not seeing this flake anymore. Closing this issue.
