Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a crash when attempting to read a block pointer with no valid DVAs #17078

Merged
merged 1 commit into from
Mar 13, 2025

Conversation

asomers
Copy link
Contributor

@asomers asomers commented Feb 20, 2025

Motivation and Context

Somehow ZFS wrote to my pool a block pointer that is neither embedded nor a hole, yet contains no DVAs with a non-zero asize. That should be impossible, but it happened. This PR will cause ZFS to return ECKSUM when attempting to read from that block pointer, rather than crashing. This PR will also fix zdb so it can display such block pointers. Finally, I believe that this PR will prevent such block pointers from being written to disk in the first place, by triggering a panic in zfs_blkptr_verify in the write path.

Description

  • Modify zfs_blkptr_verify to ensure that every block pointer is either embedded, hole, or has at least one valid DVA.
  • Modify zdb to display such block pointers rather than failing an assertion. They will displayed with no DVA shown.

How Has This Been Tested?

Tested on FreeBSD 15.0-CURRENT by trying to read from a suitably corrupted dataset.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@asomers asomers force-pushed the no-dvas branch 2 times, most recently from 8f15de1 to 00fb087 Compare February 20, 2025 19:09
@amotin amotin added the Status: Code Review Needed Ready for review and testing label Feb 21, 2025
@amotin
Copy link
Member

amotin commented Feb 25, 2025

@asomers Could you look on the remaining comment?

@asomers
Copy link
Contributor Author

asomers commented Mar 10, 2025

@amotin are there any other questions you have for me?

@amotin
Copy link
Member

amotin commented Mar 11, 2025

are there any other questions you have for me?

@asomers I closed one conversation for later, but another one is still not answered -- added comment there. And Github wants a rebase.

Copy link
Member

@amotin amotin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. And please squash and rebase the commits.

Now instead of crashing when attempting to read the corrupt block
pointer, ZFS will return ECKSUM, in a stack that looks like this:

```
none:set-error
zfs.ko`arc_read+0x1d82
zfs.ko`dbuf_read+0xa8c
zfs.ko`dmu_buf_hold_array_by_dnode+0x292
zfs.ko`dmu_read_uio_dnode+0x47
zfs.ko`zfs_read+0x2d5
zfs.ko`zfs_freebsd_read+0x7b
kernel`VOP_READ_APV+0xd0
kernel`vn_read+0x20e
kernel`vn_io_fault_doio+0x45
kernel`vn_io_fault1+0x15e
kernel`vn_io_fault+0x150
kernel`dofileread+0x80
kernel`sys_read+0xb7
kernel`amd64_syscall+0x424
kernel`0xffffffff810633cb
```

This patch should hopefully also prevent such corrupt block pointers
from being written to disk in the first place.

And in zdb, don't crash when printing a block pointer with no valid
DVAs.  If a block pointer isn't embedded yet doesn't have any valid
DVAs, that's a data corruption bug.  zdb should be able to handle the
situation gracefully.

Finally, remove an extra check for gang blocks in SNPRINTF_BLKPTR.  This
check, which compares the asizes of two different DVAs within the same
BP, was added by illumos-gate commit b24ab67[^1], and I can't understand
why.  It doesn't appear to do anything useful, so remove it.

Fixes		openzfs#17077
Sponsored by:	ConnectWise
Signed-off-by:	Alan Somers <[email protected]>

[^1]: illumos/illumos-gate@b24ab67
@asomers
Copy link
Contributor Author

asomers commented Mar 12, 2025

Thank you. And please squash and rebase the commits.

done.

@amotin amotin added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Mar 12, 2025
@github-actions github-actions bot removed the Status: Accepted Ready to integrate (reviewed, tested) label Mar 12, 2025
@amotin amotin added the Status: Accepted Ready to integrate (reviewed, tested) label Mar 13, 2025
@amotin amotin merged commit 0433523 into openzfs:master Mar 13, 2025
20 of 25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants