Skip to content

Conversation

tonyhutter
Copy link
Contributor

Motivation and Context

Print out the byte ranges of corruption in damaged files

Description

Add a -vv option to zpool status to print all error'd out block ranges:

$ zpool status -vv
...
    NAME        STATE     READ WRITE CKSUM
    testpool    ONLINE       0     0     0
      loop0     ONLINE       0     0    20

errors: Permanent errors have been detected in the following files:

    /var/tmp/testdir/10m_file: found 9 corrupted 128K blocks
       [0x0-0x1ffff] (128K)
       [0x100000-0x1fffff] (1M)

    /var/tmp/testdir/1m_file: found 1 corrupted 128K block
       [0x0-0x1ffff] (128K)

This is a modification of @TulsiJain's #8902 patch, with updates added to print out the different ranges of errors, and to bring the code up to date with master.

How Has This Been Tested?

Added test case

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

@tonyhutter tonyhutter mentioned this pull request Dec 28, 2019
12 tasks
@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Dec 28, 2019
@tonyhutter tonyhutter force-pushed the error-ranges branch 2 times, most recently from 61964ab to 325ce73 Compare December 30, 2019 20:42
@codecov
Copy link

codecov bot commented Dec 31, 2019

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 79.66%. Comparing base (82e996c) to head (62b89fe).
⚠️ Report is 4520 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #9781      +/-   ##
==========================================
+ Coverage   79.40%   79.66%   +0.26%     
==========================================
  Files         385      385              
  Lines      121481   121533      +52     
==========================================
+ Hits        96461    96821     +360     
+ Misses      25020    24712     -308     
Flag Coverage Δ
kernel 79.84% <100.00%> (+0.03%) ⬆️
user 67.43% <95.65%> (+0.61%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@behlendorf
Copy link
Contributor

Rebasing this on master will resolve the CI failures.

Add a '-vv' option to zpool status to print all error'd out blocks
ranges:

$ zpool status -vv
...
    NAME        STATE     READ WRITE CKSUM
    testpool    ONLINE       0     0     0
      loop0     ONLINE       0     0    20

errors: Permanent errors have been detected in the following files:

    /var/tmp/testdir/10m_file: found 9 corrupted 128K blocks
       [0x0-0x1ffff] (128K)
       [0x100000-0x1fffff] (1M)

    /var/tmp/testdir/1m_file: found 1 corrupted 128K block
       [0x0-0x1ffff] (128K)

This is a modification of @TulsiJain's openzfs#8902 patch, with updates added
to print out the different ranges of errors, and to bring the code up to
date with master.

Signed-off-by: Tony Hutter <[email protected]>
Co-authored-by: TulsiJain <[email protected]>
$(top_builddir)/lib/libuutil/libuutil.la \
$(top_builddir)/lib/libzfs/libzfs.la
$(top_builddir)/lib/libzfs/libzfs.la \
$(top_builddir)/lib/libzpool/libzpool.la
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to avoid linking libzpool.la with any of the command line utilities. Since this was brought in solely for the btrees and range trees perhaps we can refactor those in a user space convenience library so they can be used. Alternately, we'll need to merge the bookmark_phys_t's in to ranges some other way.

int cb_namewidth;
unsigned int cb_verbose;
boolean_t cb_allpools;
boolean_t cb_verbose;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should consider making this the default behavior and always printing the ranges.

range_tree = range_tree_create(NULL,
RANGE_SEG64, NULL, 0, 0);
if (!range_tree)
goto fail;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of adding a label you can just break out of the while loop here.

sizeof (*indrt_levels));

/* Write our object group's objset and object */
VERIFY0(nvlist_add_uint64(nv, ZPOOL_ERR_DATASET, zb[0].zb_objset));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the user space only code please use verify() instead.

uint64_t zs_links;
uint64_t zs_ctime[2];
uint64_t zs_data_block_size;
uint64_t zs_indirect_block_size;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While not needed by this patch I think it would be useful to add zs_dnode_size here as well.

Copy link
Contributor

@pcd1193182 pcd1193182 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also concur with @behlendorf's point that we shouldn't be linking in libzpool. Otherwise, it looks good to me.


verify(nvlist_alloc(nverrlistp, 0, KM_SLEEP) == 0);


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: double blank line

@ahrens
Copy link
Member

ahrens commented Apr 20, 2020

@tonyhutter, I wanted to check where you are with this. Is this something that someone else could/should help to get this completed?

@tonyhutter
Copy link
Contributor Author

@ahrens I'm pretty bandwidth limited right now, and it's a low priority fix for us. If someone wants to push this over the finish line, I'd be fine with that.

@behlendorf behlendorf mentioned this pull request Aug 22, 2025
14 tasks
@behlendorf
Copy link
Contributor

Replaced by #17502.

@behlendorf behlendorf closed this Aug 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Status: Code Review Needed Ready for review and testing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants