Skip to content

Fix 2 bugs in non-raw send with encryption #17340

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 19, 2025

Conversation

gamanakis
Copy link
Contributor

@gamanakis gamanakis commented May 16, 2025

Motivation and Context

Closes #12014

Description

Bisecting identified the redacted send/receive as the source of the bug
for issue #12014. Specifically the call to
dsl_dataset_hold_obj(&fromds) has been replaced by
dsl_dataset_hold_obj_flags() which passes a DECRYPT flag and creates
a key mapping. A subsequent dsl_dataset_rele_flag(&fromds) is missing
and the key mapping is not cleared. This may be inadvertedly used, which
results in arc_untransform failing with ECKSUM in:
arc_untransform+0x96/0xb0 [zfs]
dbuf_read_verify_dnode_crypt+0x196/0x350 [zfs]
dbuf_read+0x56/0x770 [zfs]
dmu_buf_hold_by_dnode+0x4a/0x80 [zfs]
zap_lockdir+0x87/0xf0 [zfs]
zap_lookup_norm+0x5c/0xd0 [zfs]
zap_lookup+0x16/0x20 [zfs]
zfs_get_zplprop+0x8d/0x1d0 [zfs]
setup_featureflags+0x267/0x2e0 [zfs]
dmu_send_impl+0xe7/0xcb0 [zfs]
dmu_send_obj+0x265/0x360 [zfs]
zfs_ioc_send+0x10c/0x280 [zfs]

Fix this by restoring the call to dsl_dataset_hold_obj().

The same applies for to_ds: here replace dsl_dataset_rele(&to_ds) with
dsl_dataset_rele_flags().

Both leaked key mappings will cause a panic when exporting the
sending pool or unloading the zfs module after a non-raw send from
an encrypted filesystem.

How Has This Been Tested?

Manually running the scripts https://github.com/HankB/provoke_ZFS_corruption

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@github-actions github-actions bot added the Status: Work in Progress Not yet ready for general review label May 16, 2025
@gamanakis
Copy link
Contributor Author

gamanakis commented May 16, 2025

How can I acknowledge @HankB and @pcd1193182 for the significant effort they put into this?

Copy link
Member

@amotin amotin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me. Though after searching around this area for couple hours I still have no idea why it helps with the corruptions. I think it is only a trigger, but for what?

@tonyhutter
Copy link
Contributor

How can I acknowledge @HankB and @pcd1193182 for the significant effort they put into this?

Contributions-by: would be my recommendation. I see some of those in the git logs.

Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fixes here make sense. But I agree, it's not at all clear to me how this would lead to a corruption. Was the final reproducer for this something which could be reasonably adapted and included in the test suite?

@ryao
Copy link
Contributor

ryao commented May 17, 2025

The fixes here make sense. But I agree, it's not at all clear to me how this would lead to a corruption. Was the final reproducer for this something which could be reasonably adapted and included in the test suite?

I have a WIP CodeQL check for instances of this problem in my git repository:

https://github.com/ryao/zfs/tree/issue-12014

I am waiting to see if it works. If it does, we should think about adding checks for other functions that could be accidentally mismatched.

@ryao
Copy link
Contributor

ryao commented May 17, 2025

ryao/zfs@7c921e3 works and detected the two issues fixed in this patch. The current iteration is limited to cases where the mismatched hold and release functions are called from the same function. I might try to handle more cases before I open a PR with the check.

@ryao
Copy link
Contributor

ryao commented May 17, 2025

The current iteration is limited to cases where the mismatched hold and release functions are called from the same function. I might try to handle more cases before I open a PR with the check.

I just did a manual search to see if this is even necessary. It turns out that it is. dmu_objset_hold_flags() calls dsl_dataset_hold_flags(), which calls dsl_dataset_hold_obj_flags(). The returned dsl_dataset_t pointer is then passed to dsl_dataset_rele(), which is wrong, on error from dmu_objset_from_ds() in dmu_objset_hold_flags(). If I expand the power of the check to catch mismatched hold and release functions across function calls, the check should be able to identify this and possibly other bugs.

@lzsaver lzsaver mentioned this pull request May 17, 2025
13 tasks
@gamanakis
Copy link
Contributor Author

@ryao thank you for taking this to the next level.
@behlendorf I think expanding the ZTS would be great, but the scripts take about 2-3h to reproduce this.

@gamanakis
Copy link
Contributor Author

gamanakis commented May 17, 2025

I believe the crucial issue in #12014 is the failure to clear key mappings, which is not currently handled in the code base. My suspicion is that the apparent corruption in ARC stems from a scenario where an object (snapshot) is freed, but its key mapping persists. In the current code base this happens for both fromds and to_ds. This suggests the stale key mappings are inadvertently reused.

In this context, it would likely be safe to use dsl_dataset_hold_obj_flags(&fromds) as long as we call dsl_dataset_rele_flags(&fromds) afterwards.

@HankB
Copy link

HankB commented May 18, 2025

Was the final reproducer for this something which could be reasonably adapted and included in the test suite?

I'll comment because I wrote the reproducer. (Unfortunately I remain blissfully ignorant of the test suite and how it is run.)

I produced a series of scripts that:

  • Populate a pool with compressible and random files in multiple nested datasets.
  • Modify files, create and delete snapshots and send the encrypted pool to an unencrypted pool. (IOW, thrash on the pool.)

The setup of the tests is manual (copying and pasting commands from previous test notes.) This could all be scripted, I just didn't do that.

On my H/W - ten year old server motherboard, two 500GB SATA SSDs - it takes about 2 hours typically to reproduce corruption. One problem with automating it is that the results are indeterminate. It's "whale on it to see if it breaks." If it typically breaks at 2 hours, how long insures with reasonable confidence that the problem does not exist? Some earlier testing took on the order of a day to provoke corruption but I was able to bring that down by increasing the activity on the pool. It's conceivable that other changes to the stack could result in reducing the chance of corruption, delaying onset past the typical two hours.

Perhaps the next thing would be to fully automate the setup and bring the existing scripts up to "production quality." Others have used the scripts and been kind not comment on the code quality (beyond submitting PRs for bugs that I had overlooked.) Whether the scripts could be included in a test suite is left to those more familiar with the test suite.

@gamanakis
Copy link
Contributor Author

62f125b: updated the commit message

@AndrewJDR
Copy link

I’m just an observer here, but i’m curious: Is the context and timing of the uncleared key mapping reusage understood well enough to design a synthetic test for the crash?

@owlshrimp
Copy link

I’m just an observer here, but i’m curious: Is the context and timing of the uncleared key mapping reusage understood well enough to design a synthetic test for the crash?

Or alternatively, can the theory be tested with some printf's or the like?

Bisecting identified the redacted send/receive as the source of the bug
for issue openzfs#12014. Specifically the call to
dsl_dataset_hold_obj(&fromds) has been replaced by
dsl_dataset_hold_obj_flags() which passes a DECRYPT flag and creates
a key mapping. A subsequent dsl_dataset_rele_flag(&fromds) is missing
and the key mapping is not cleared. This may be inadvertedly used, which
results in arc_untransform failing with ECKSUM in:
 arc_untransform+0x96/0xb0 [zfs]
 dbuf_read_verify_dnode_crypt+0x196/0x350 [zfs]
 dbuf_read+0x56/0x770 [zfs]
 dmu_buf_hold_by_dnode+0x4a/0x80 [zfs]
 zap_lockdir+0x87/0xf0 [zfs]
 zap_lookup_norm+0x5c/0xd0 [zfs]
 zap_lookup+0x16/0x20 [zfs]
 zfs_get_zplprop+0x8d/0x1d0 [zfs]
 setup_featureflags+0x267/0x2e0 [zfs]
 dmu_send_impl+0xe7/0xcb0 [zfs]
 dmu_send_obj+0x265/0x360 [zfs]
 zfs_ioc_send+0x10c/0x280 [zfs]

Fix this by restoring the call to dsl_dataset_hold_obj().

The same applies for to_ds: here replace dsl_dataset_rele(&to_ds) with
dsl_dataset_rele_flags().

Both leaked key mappings will cause a panic when exporting the
sending pool or unloading the zfs module after a non-raw send from
an encrypted filesystem.

Contributions-by: Hank Barta <[email protected]>
Contributions-by: Paul Dagnelie <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
@gamanakis
Copy link
Contributor Author

40567d0: updated the commit message, it's not really a panic but arc_untransform failing with ECKSUM.

The theory that the stale key mapping is reused is the only possible explanation I see. I have been trying to see if this is true but it's tricky with cmn_err(). In the current codebase all key mappings are leaked.

@amotin amotin marked this pull request as ready for review May 19, 2025 13:59
@amotin amotin added the Status: Accepted Ready to integrate (reviewed, tested) label May 19, 2025
@github-actions github-actions bot added Status: Code Review Needed Ready for review and testing and removed Status: Work in Progress Not yet ready for general review Status: Accepted Ready to integrate (reviewed, tested) labels May 19, 2025
@amotin amotin added the Status: Accepted Ready to integrate (reviewed, tested) label May 19, 2025
@behlendorf
Copy link
Contributor

@HankB thanks for the clarification on the reproducer. I figured it wouldn't be easy to add to the CI or George would have done it in this PR already, but I still wanted to ask!

@ryao the codeQL check would be a nice addition to help us catch future issues, even if it is limited to within the scope of a single calling function (for now). Plus having a working example should make it pretty easy to extend to other similar interfaces.

@behlendorf behlendorf merged commit ea74cde into openzfs:master May 19, 2025
28 checks passed
ryao added a commit to ryao/zfs that referenced this pull request May 19, 2025
This was caught when doing a manual check to see if openzfs#17352 needed to be
improved to catch mismatches across stack frames of the kind that were
first found in openzfs#17340.

Signed-off-by: Richard Yao <[email protected]>
gamanakis added a commit to gamanakis/zfs that referenced this pull request May 20, 2025
Bisecting identified the redacted send/receive as the source of the bug
for issue openzfs#12014. Specifically the call to
dsl_dataset_hold_obj(&fromds) has been replaced by
dsl_dataset_hold_obj_flags() which passes a DECRYPT flag and creates
a key mapping. A subsequent dsl_dataset_rele_flag(&fromds) is missing
and the key mapping is not cleared. This may be inadvertedly used, which
results in arc_untransform failing with ECKSUM in:
 arc_untransform+0x96/0xb0 [zfs]
 dbuf_read_verify_dnode_crypt+0x196/0x350 [zfs]
 dbuf_read+0x56/0x770 [zfs]
 dmu_buf_hold_by_dnode+0x4a/0x80 [zfs]
 zap_lockdir+0x87/0xf0 [zfs]
 zap_lookup_norm+0x5c/0xd0 [zfs]
 zap_lookup+0x16/0x20 [zfs]
 zfs_get_zplprop+0x8d/0x1d0 [zfs]
 setup_featureflags+0x267/0x2e0 [zfs]
 dmu_send_impl+0xe7/0xcb0 [zfs]
 dmu_send_obj+0x265/0x360 [zfs]
 zfs_ioc_send+0x10c/0x280 [zfs]

Fix this by restoring the call to dsl_dataset_hold_obj().

The same applies for to_ds: here replace dsl_dataset_rele(&to_ds) with
dsl_dataset_rele_flags().

Both leaked key mappings will cause a panic when exporting the
sending pool or unloading the zfs module after a non-raw send from
an encrypted filesystem.

Contributions-by: Hank Barta <[email protected]>
Contributions-by: Paul Dagnelie <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Richard Yao <[email protected]>
Reviewed-by: Rob Norris <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes openzfs#12014 
Closes openzfs#17340
behlendorf pushed a commit that referenced this pull request May 20, 2025
This was caught when doing a manual check to see if #17352 needed to be
improved to catch mismatches across stack frames of the kind that were
first found in #17340.

Reviewed-by: George Amanakis <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes #17353
@robn
Copy link
Member

robn commented May 22, 2025

Has anyone done a full analysis to understand how this bug manifested as the myriad encryption bugs, panics and other quirks we've seen over the years?

Don't get me wrong; this PR is obviously right for what it is, and I won't be surprised if it closes a locking gap somewhere that could be hit if you're unlucky, but I'd like those dots connected so I can feel comfortable saying yes, this definitely sorts it out. Especially since the forums are starting to pick up on it and the high-fives and cheers are making me a bit uncomfortable.

(and yes, I am a blast at parties 🎉 😬)

If not, I've got some spare time this afternoon; I'll try to do the analysis myself.

@amotin
Copy link
Member

amotin commented May 22, 2025

@robn I don't think there is still a proper understanding of what's going on. At least I haven't seen and don't have one.

robn pushed a commit to robn/zfs that referenced this pull request May 23, 2025
Bisecting identified the redacted send/receive as the source of the bug
for issue openzfs#12014. Specifically the call to
dsl_dataset_hold_obj(&fromds) has been replaced by
dsl_dataset_hold_obj_flags() which passes a DECRYPT flag and creates
a key mapping. A subsequent dsl_dataset_rele_flag(&fromds) is missing
and the key mapping is not cleared. This may be inadvertedly used, which
results in arc_untransform failing with ECKSUM in:
 arc_untransform+0x96/0xb0 [zfs]
 dbuf_read_verify_dnode_crypt+0x196/0x350 [zfs]
 dbuf_read+0x56/0x770 [zfs]
 dmu_buf_hold_by_dnode+0x4a/0x80 [zfs]
 zap_lockdir+0x87/0xf0 [zfs]
 zap_lookup_norm+0x5c/0xd0 [zfs]
 zap_lookup+0x16/0x20 [zfs]
 zfs_get_zplprop+0x8d/0x1d0 [zfs]
 setup_featureflags+0x267/0x2e0 [zfs]
 dmu_send_impl+0xe7/0xcb0 [zfs]
 dmu_send_obj+0x265/0x360 [zfs]
 zfs_ioc_send+0x10c/0x280 [zfs]

Fix this by restoring the call to dsl_dataset_hold_obj().

The same applies for to_ds: here replace dsl_dataset_rele(&to_ds) with
dsl_dataset_rele_flags().

Both leaked key mappings will cause a panic when exporting the
sending pool or unloading the zfs module after a non-raw send from
an encrypted filesystem.

Contributions-by: Hank Barta <[email protected]>
Contributions-by: Paul Dagnelie <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Richard Yao <[email protected]>
Reviewed-by: Rob Norris <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes openzfs#12014
Closes openzfs#17340
(cherry picked from commit ea74cde)
robn pushed a commit to robn/zfs that referenced this pull request May 23, 2025
This was caught when doing a manual check to see if openzfs#17352 needed to be
improved to catch mismatches across stack frames of the kind that were
first found in openzfs#17340.

Reviewed-by: George Amanakis <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes openzfs#17353
(cherry picked from commit 83fa80a)
(cherry picked from commit 07d815f)
robn pushed a commit to robn/zfs that referenced this pull request May 23, 2025
This was caught when doing a manual check to see if openzfs#17352 needed to be
improved to catch mismatches across stack frames of the kind that were
first found in openzfs#17340.

Reviewed-by: George Amanakis <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes openzfs#17353
(cherry picked from commit 83fa80a)
(cherry picked from commit 07d815f2a06573514f51c8601aa60db6fe1a76ad)
robn pushed a commit to robn/zfs that referenced this pull request May 23, 2025
Bisecting identified the redacted send/receive as the source of the bug
for issue openzfs#12014. Specifically the call to
dsl_dataset_hold_obj(&fromds) has been replaced by
dsl_dataset_hold_obj_flags() which passes a DECRYPT flag and creates
a key mapping. A subsequent dsl_dataset_rele_flag(&fromds) is missing
and the key mapping is not cleared. This may be inadvertedly used, which
results in arc_untransform failing with ECKSUM in:
 arc_untransform+0x96/0xb0 [zfs]
 dbuf_read_verify_dnode_crypt+0x196/0x350 [zfs]
 dbuf_read+0x56/0x770 [zfs]
 dmu_buf_hold_by_dnode+0x4a/0x80 [zfs]
 zap_lockdir+0x87/0xf0 [zfs]
 zap_lookup_norm+0x5c/0xd0 [zfs]
 zap_lookup+0x16/0x20 [zfs]
 zfs_get_zplprop+0x8d/0x1d0 [zfs]
 setup_featureflags+0x267/0x2e0 [zfs]
 dmu_send_impl+0xe7/0xcb0 [zfs]
 dmu_send_obj+0x265/0x360 [zfs]
 zfs_ioc_send+0x10c/0x280 [zfs]

Fix this by restoring the call to dsl_dataset_hold_obj().

The same applies for to_ds: here replace dsl_dataset_rele(&to_ds) with
dsl_dataset_rele_flags().

Both leaked key mappings will cause a panic when exporting the
sending pool or unloading the zfs module after a non-raw send from
an encrypted filesystem.

Contributions-by: Hank Barta <[email protected]>
Contributions-by: Paul Dagnelie <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Richard Yao <[email protected]>
Reviewed-by: Rob Norris <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes openzfs#12014
Closes openzfs#17340
(cherry picked from commit ea74cde)
robn pushed a commit to robn/zfs that referenced this pull request May 23, 2025
This was caught when doing a manual check to see if openzfs#17352 needed to be
improved to catch mismatches across stack frames of the kind that were
first found in openzfs#17340.

Reviewed-by: George Amanakis <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes openzfs#17353
(cherry picked from commit 83fa80a)
@robn robn mentioned this pull request May 23, 2025
14 tasks
robn pushed a commit to robn/zfs that referenced this pull request May 24, 2025
Bisecting identified the redacted send/receive as the source of the bug
for issue openzfs#12014. Specifically the call to
dsl_dataset_hold_obj(&fromds) has been replaced by
dsl_dataset_hold_obj_flags() which passes a DECRYPT flag and creates
a key mapping. A subsequent dsl_dataset_rele_flag(&fromds) is missing
and the key mapping is not cleared. This may be inadvertedly used, which
results in arc_untransform failing with ECKSUM in:
 arc_untransform+0x96/0xb0 [zfs]
 dbuf_read_verify_dnode_crypt+0x196/0x350 [zfs]
 dbuf_read+0x56/0x770 [zfs]
 dmu_buf_hold_by_dnode+0x4a/0x80 [zfs]
 zap_lockdir+0x87/0xf0 [zfs]
 zap_lookup_norm+0x5c/0xd0 [zfs]
 zap_lookup+0x16/0x20 [zfs]
 zfs_get_zplprop+0x8d/0x1d0 [zfs]
 setup_featureflags+0x267/0x2e0 [zfs]
 dmu_send_impl+0xe7/0xcb0 [zfs]
 dmu_send_obj+0x265/0x360 [zfs]
 zfs_ioc_send+0x10c/0x280 [zfs]

Fix this by restoring the call to dsl_dataset_hold_obj().

The same applies for to_ds: here replace dsl_dataset_rele(&to_ds) with
dsl_dataset_rele_flags().

Both leaked key mappings will cause a panic when exporting the
sending pool or unloading the zfs module after a non-raw send from
an encrypted filesystem.

Contributions-by: Hank Barta <[email protected]>
Contributions-by: Paul Dagnelie <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Richard Yao <[email protected]>
Reviewed-by: Rob Norris <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes openzfs#12014
Closes openzfs#17340
(cherry picked from commit ea74cde)
robn pushed a commit to robn/zfs that referenced this pull request May 24, 2025
This was caught when doing a manual check to see if openzfs#17352 needed to be
improved to catch mismatches across stack frames of the kind that were
first found in openzfs#17340.

Reviewed-by: George Amanakis <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes openzfs#17353
(cherry picked from commit 83fa80a)
tonyhutter pushed a commit that referenced this pull request May 27, 2025
Bisecting identified the redacted send/receive as the source of the bug
for issue #12014. Specifically the call to
dsl_dataset_hold_obj(&fromds) has been replaced by
dsl_dataset_hold_obj_flags() which passes a DECRYPT flag and creates
a key mapping. A subsequent dsl_dataset_rele_flag(&fromds) is missing
and the key mapping is not cleared. This may be inadvertedly used, which
results in arc_untransform failing with ECKSUM in:
 arc_untransform+0x96/0xb0 [zfs]
 dbuf_read_verify_dnode_crypt+0x196/0x350 [zfs]
 dbuf_read+0x56/0x770 [zfs]
 dmu_buf_hold_by_dnode+0x4a/0x80 [zfs]
 zap_lockdir+0x87/0xf0 [zfs]
 zap_lookup_norm+0x5c/0xd0 [zfs]
 zap_lookup+0x16/0x20 [zfs]
 zfs_get_zplprop+0x8d/0x1d0 [zfs]
 setup_featureflags+0x267/0x2e0 [zfs]
 dmu_send_impl+0xe7/0xcb0 [zfs]
 dmu_send_obj+0x265/0x360 [zfs]
 zfs_ioc_send+0x10c/0x280 [zfs]

Fix this by restoring the call to dsl_dataset_hold_obj().

The same applies for to_ds: here replace dsl_dataset_rele(&to_ds) with
dsl_dataset_rele_flags().

Both leaked key mappings will cause a panic when exporting the
sending pool or unloading the zfs module after a non-raw send from
an encrypted filesystem.

Contributions-by: Hank Barta <[email protected]>
Contributions-by: Paul Dagnelie <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Richard Yao <[email protected]>
Reviewed-by: Rob Norris <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes #12014
Closes #17340
(cherry picked from commit ea74cde)
tonyhutter pushed a commit that referenced this pull request May 27, 2025
This was caught when doing a manual check to see if #17352 needed to be
improved to catch mismatches across stack frames of the kind that were
first found in #17340.

Reviewed-by: George Amanakis <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes #17353
(cherry picked from commit 83fa80a)
(cherry picked from commit 07d815f)
tonyhutter pushed a commit that referenced this pull request May 28, 2025
Bisecting identified the redacted send/receive as the source of the bug
for issue #12014. Specifically the call to
dsl_dataset_hold_obj(&fromds) has been replaced by
dsl_dataset_hold_obj_flags() which passes a DECRYPT flag and creates
a key mapping. A subsequent dsl_dataset_rele_flag(&fromds) is missing
and the key mapping is not cleared. This may be inadvertedly used, which
results in arc_untransform failing with ECKSUM in:
 arc_untransform+0x96/0xb0 [zfs]
 dbuf_read_verify_dnode_crypt+0x196/0x350 [zfs]
 dbuf_read+0x56/0x770 [zfs]
 dmu_buf_hold_by_dnode+0x4a/0x80 [zfs]
 zap_lockdir+0x87/0xf0 [zfs]
 zap_lookup_norm+0x5c/0xd0 [zfs]
 zap_lookup+0x16/0x20 [zfs]
 zfs_get_zplprop+0x8d/0x1d0 [zfs]
 setup_featureflags+0x267/0x2e0 [zfs]
 dmu_send_impl+0xe7/0xcb0 [zfs]
 dmu_send_obj+0x265/0x360 [zfs]
 zfs_ioc_send+0x10c/0x280 [zfs]

Fix this by restoring the call to dsl_dataset_hold_obj().

The same applies for to_ds: here replace dsl_dataset_rele(&to_ds) with
dsl_dataset_rele_flags().

Both leaked key mappings will cause a panic when exporting the
sending pool or unloading the zfs module after a non-raw send from
an encrypted filesystem.

Contributions-by: Hank Barta <[email protected]>
Contributions-by: Paul Dagnelie <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Richard Yao <[email protected]>
Reviewed-by: Rob Norris <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes #12014
Closes #17340
(cherry picked from commit ea74cde)
tonyhutter pushed a commit that referenced this pull request May 28, 2025
This was caught when doing a manual check to see if #17352 needed to be
improved to catch mismatches across stack frames of the kind that were
first found in #17340.

Reviewed-by: George Amanakis <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes #17353
(cherry picked from commit 83fa80a)
pcd1193182 pushed a commit to KlaraSystems/zfs that referenced this pull request Jun 2, 2025
Bisecting identified the redacted send/receive as the source of the bug
for issue openzfs#12014. Specifically the call to
dsl_dataset_hold_obj(&fromds) has been replaced by
dsl_dataset_hold_obj_flags() which passes a DECRYPT flag and creates
a key mapping. A subsequent dsl_dataset_rele_flag(&fromds) is missing
and the key mapping is not cleared. This may be inadvertedly used, which
results in arc_untransform failing with ECKSUM in:
 arc_untransform+0x96/0xb0 [zfs]
 dbuf_read_verify_dnode_crypt+0x196/0x350 [zfs]
 dbuf_read+0x56/0x770 [zfs]
 dmu_buf_hold_by_dnode+0x4a/0x80 [zfs]
 zap_lockdir+0x87/0xf0 [zfs]
 zap_lookup_norm+0x5c/0xd0 [zfs]
 zap_lookup+0x16/0x20 [zfs]
 zfs_get_zplprop+0x8d/0x1d0 [zfs]
 setup_featureflags+0x267/0x2e0 [zfs]
 dmu_send_impl+0xe7/0xcb0 [zfs]
 dmu_send_obj+0x265/0x360 [zfs]
 zfs_ioc_send+0x10c/0x280 [zfs]

Fix this by restoring the call to dsl_dataset_hold_obj().

The same applies for to_ds: here replace dsl_dataset_rele(&to_ds) with
dsl_dataset_rele_flags().

Both leaked key mappings will cause a panic when exporting the
sending pool or unloading the zfs module after a non-raw send from
an encrypted filesystem.

Contributions-by: Hank Barta <[email protected]>
Contributions-by: Paul Dagnelie <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Richard Yao <[email protected]>
Reviewed-by: Rob Norris <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes openzfs#12014 
Closes openzfs#17340
@almereyda almereyda mentioned this pull request Jun 13, 2025
14 tasks
@girgen
Copy link

girgen commented Jul 3, 2025

@robn I don't think there is still a proper understanding of what's going on. At least I haven't seen and don't have one.

Are you sure that this statement still stands? The description from the FreeBSD advisory [1] is pretty clear what happens:

II.  Problem Description

ZFS has built-in replication and backup functionality, which serializes a
filesystem for transport to another system, known as "ZFS send".  ZFS send
also supports incremental updates between a pair of snapshots.  When sending
an encrypted dataset, the dataset can either be left encrypted for
transit/receipt (raw mode), or decrypted.  During a decrypting (normal) send,
a bug in the code caused some metadata (key mappings) in the snapshots to be
decrypted in memory, but not properly released.  As a result, the key mappings
used for decryption were not freed from the in-memory table.

OK, the sentence

... they can result in spurious checksum errors when they are
incorrectly used to access data later.

is perhaps a bit vague, but that could be only due to the brevity of the text.

[1] https://www.freebsd.org/security/advisories/FreeBSD-EN-25:10.zfs.asc

@robn
Copy link
Member

robn commented Jul 3, 2025

I still don't think anyone has worked out the exact interactions and timing required between threads to trip it, but I think its understood enough. If you know enough of the code, it's fairly straightforward to draw a line from a leftover key mapping in memory to the various symptoms that have been reported over the years, and since those reports have dried up in the last few weeks, so I think it's safe to say "idk, seems plausible!".

It could probably be better understood, but there's loads of things like that, and they take whole hours to study. I'm content for now to stick this one on my reading list (tsundoku-style).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ZFS corruption related to snapshots post-2.0.x upgrade
10 participants