@behlendorf
Contributor

Motivation and Context

Keep the CI green. We occasionally see this false positive in the CI, which may distract from real issues.

https://github.com/openzfs/zfs/actions/runs/19481983821/job/55755445615?pr=17941

I'm not thrilled about the need for this, but given that the test itself isn't entirely deterministic, a retry seems like a reasonable compromise.

Description

While not common, the draid3 vdev type has been observed to not always sit out a vdev when run in the CI. To prevent continued false positives, allow the test to be retried up to three times before considering it a failure.

How Has This Been Tested?

Locally tested by manually injecting a random percentage of failed sit outs to exercise the retry code.
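
For readers who want to try something similar, one way to inject a random percentage of failures might look like the sketch below. It is not the code used for the local testing described above: the `sit_out_ok` function and the `FAIL_PCT` variable are hypothetical names invented for this illustration, and the sit out itself is only simulated.

```shell
#!/bin/sh
# Hypothetical fault injection: pretend a sit out failed for roughly
# FAIL_PCT percent of calls so the retry path gets exercised.
FAIL_PCT=${FAIL_PCT:-30}   # assumed knob, not an option of the real test

# Succeeds when the simulated sit out "happened", fails otherwise.
sit_out_ok() {
    # Two random bytes read as an unsigned integer (0..65535).
    r=$(od -An -N2 -tu2 /dev/urandom | tr -dc '0-9')
    # Fail when the value modulo 100 falls below the requested percentage.
    [ $((r % 100)) -ge "$1" ]
}

# Count how many of 200 simulated sit outs were forced to fail.
failed=0
trial=0
while [ "$trial" -lt 200 ]; do
    sit_out_ok "$FAIL_PCT" || failed=$((failed + 1))
    trial=$((trial + 1))
done
echo "injected failures: $failed of 200"
```

A percentage of 0 never injects a failure and 100 always does, which makes the two extremes easy to check before trying intermediate values.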

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Quality assurance (non-breaking change which makes the code more robust against bugs)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@behlendorf requested a review from pcd1193182 on December 2, 2025 01:17
@behlendorf added the labels "Component: Test Suite" (Indicates an issue with the test framework or a test case) and "Status: Code Review Needed" (Ready for review and testing) on Dec 2, 2025
@behlendorf requested a review from Copilot on December 2, 2025 20:29
Copilot finished reviewing on behalf of behlendorf on December 2, 2025 20:31
Copilot AI left a comment

Pull request overview

This PR adds retry logic to the slow_vdev_degraded_sit_out ZFS test to handle non-deterministic behavior observed in CI, particularly with draid3 vdev types that don't always trigger a sit-out condition.

Key changes:

  • Converts the test loop from a foreach-style iteration over raid types to an indexed loop to enable retries
  • Adds conditional retry logic (up to 3 attempts) when a vdev doesn't sit out as expected
  • Removes the unconditional assertion that sit_out must be "on" after I/O operations
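
The loop restructuring summarized in these bullets can be sketched roughly as follows. This is a minimal, hypothetical illustration in portable shell, not the code from the PR: `check_sit_out`, `MAX_ATTEMPTS`, the list of raid types, and the simulated first-attempt failure for draid3 are all assumptions made for the example. The real test lives in the ZFS Test Suite and uses its own helpers.

```shell
#!/bin/sh
# Hypothetical sketch: an indexed loop over raid types that retries the
# current type instead of advancing when the vdev fails to sit out.

RAID_TYPES="raidz1 raidz2 draid3"   # assumed list, for illustration only
MAX_ATTEMPTS=3
calls=0

# Stand-in for the real check; simulates draid3 failing on its first try.
check_sit_out() {
    calls=$((calls + 1))
    if [ "$1" = "draid3" ] && [ "$2" -eq 1 ]; then
        return 1
    fi
    return 0
}

# Load the raid types into the positional parameters so they can be indexed.
set -- $RAID_TYPES
i=1
attempt=1
status=pass
while [ "$i" -le "$#" ]; do
    eval "raidtype=\${$i}"
    if check_sit_out "$raidtype" "$attempt"; then
        echo "$raidtype: sit out observed on attempt $attempt"
        i=$((i + 1))        # advance only after a successful sit out
        attempt=1
    elif [ "$attempt" -lt "$MAX_ATTEMPTS" ]; then
        echo "$raidtype: no sit out on attempt $attempt, retrying"
        attempt=$((attempt + 1))
    else
        echo "$raidtype: no sit out after $MAX_ATTEMPTS attempts"
        status=fail
        break
    fi
done
echo "result: $status"
```

The index only advances after a successful sit out, which is what a plain foreach-style loop cannot express: a failed attempt re-runs the same raid type until the attempt limit is reached.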

While not common the draid3 vdev type has been observed to
not always sit out a vdev when run in the CI.  To prevent
continued false positives allow the test to be retried up
to three times before considering it a failure.

Signed-off-by: Brian Behlendorf <[email protected]>
@behlendorf force-pushed the zts-retry-degraded_sit_out branch from 20ed8e8 to ae04e43 on December 3, 2025 00:35
@behlendorf added the label "Status: Accepted" (Ready to integrate (reviewed, tested)) and removed the label "Status: Code Review Needed" (Ready for review and testing) on Dec 3, 2025
@behlendorf merged commit dfb0875 into openzfs:master on Dec 4, 2025
25 of 26 checks passed