linux: zvols: correctly detect flush requests #17131
Conversation
it would be great if this could be backported to 2.3.x (and 2.2.x, in case new releases will be cut there as well) once reviewed and merged into master! tagging @robn @tonyhutter
As far as I can see, modern kernels define both REQ_OP_FLUSH and REQ_PREFLUSH, so it is not obvious to me which one is right these days. Maybe checking both, the same as before in the HAVE_REQ_OP_FLUSH case, would be safer? If nothing else, the comment above the function is inconsistent with the proposed content.
yes, the comment already got broken by an earlier kernel compat bump. I tested both the variant for 4.10+ (just checking PREFLUSH, as it ended up in the final commit) and the one before (checking either OP or the flag); both seem to work. the kernel's …
```diff
diff --git a/include/os/linux/kernel/linux/blkdev_compat.h b/include/os/linux/kernel/linux/blkdev_compat.h
index d96708c60..8b040f4d6 100644
--- a/include/os/linux/kernel/linux/blkdev_compat.h
+++ b/include/os/linux/kernel/linux/blkdev_compat.h
@@ -383,7 +383,7 @@ bio_set_flush(struct bio *bio)
 static inline boolean_t
 bio_is_flush(struct bio *bio)
 {
-	return (bio_op(bio) == REQ_OP_FLUSH);
+	return (op_is_flush(bio->bi_opf));
 }
```
I can't seem to find the equivalent …
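For reference, the alternative variant in the diff above relies on the kernel's op_is_flush() helper. A paraphrased sketch of that helper follows (the exact parameter type differs between kernel versions); note that it also matches REQ_FUA, so it is slightly broader than a check for flush requests alone.

```c
/*
 * Paraphrased from the Linux kernel's include/linux/blk_types.h (which also
 * provides REQ_FUA and REQ_PREFLUSH); not ZFS code.  A request carries flush
 * semantics if either the preflush or the FUA flag is set in its op/flags word.
 */
static inline bool
op_is_flush(unsigned int op)
{
	return ((op & (REQ_FUA | REQ_PREFLUSH)) != 0);
}
```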
Force-pushed from 3f2242e to c9ae14e
can confirm that …
@Fabian-Gruenbichler this looks good. If you squash the commits it should be ready to merge.
since 4.10, bio->bi_opf needs to be checked to determine all kinds of flush requests. this was the case prior to the commit referenced below, but the order of ifdefs was not the usual one (newest up top), which might have caused this to slip through.

this fixes a regression when using zvols as Qemu block devices, but might have broken other use cases as well. the symptoms are that all sync writes from within a VM configured to use such a virtual block device are ignored and treated as async writes by the host ZFS layer.

this can be verified using fio in sync mode inside the VM, for example with

fio \
  --filename=/dev/sda --ioengine=libaio --loops=1 --size=10G \
  --time_based --runtime=60 --group_reporting --stonewall --name=cc1 \
  --description="CC1" --rw=write --bs=4k --direct=1 --iodepth=1 \
  --numjobs=1 --sync=1

which shows an IOPS number way above what the physical device underneath supports, with "zpool iostat -r 1" on the hypervisor side showing no sync IO occurring during the benchmark.

with the regression fixed, both fio inside the VM and the IO stats on the host show the expected numbers.

Fixes: 846b598 "config: remove HAVE_REQ_OP_* and HAVE_REQ_*"
Co-authored-by: Alexander Motin <[email protected]>
Signed-off-by: Fabian-Gruenbichler <[email protected]>
Force-pushed from ee17324 to 644a7a7
done! left the co-authored-by that github added, the change is trivial either way :) thanks for the speedy review
since 4.10, bio->bi_opf needs to be checked to determine all kinds of flush requests. this was the case prior to the commit referenced below, but the order of ifdefs was not the usual one (newest up top), which might have caused this to slip through.

this fixes a regression when using zvols as Qemu block devices, but might have broken other use cases as well. the symptoms are that all sync writes from within a VM configured to use such a virtual block device are ignored and treated as async writes by the host ZFS layer.

this can be verified using fio in sync mode inside the VM, for example with

fio \
  --filename=/dev/sda --ioengine=libaio --loops=1 --size=10G \
  --time_based --runtime=60 --group_reporting --stonewall --name=cc1 \
  --description="CC1" --rw=write --bs=4k --direct=1 --iodepth=1 \
  --numjobs=1 --sync=1

which shows an IOPS number way above what the physical device underneath supports, with "zpool iostat -r 1" on the hypervisor side showing no sync IO occurring during the benchmark.

with the regression fixed, both fio inside the VM and the IO stats on the host show the expected numbers.

Fixes: 846b598 "config: remove HAVE_REQ_OP_* and HAVE_REQ_*"
Signed-off-by: Fabian-Gruenbichler <[email protected]>
Co-authored-by: Alexander Motin <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
NAS-134891 / 25.10 / linux: zvols: correctly detect flush requests (openzfs#17131)
NAS-134891 / 25.04.0 / linux: zvols: correctly detect flush requests (openzfs#17131)
Motivation and Context
a user reported that since ZFS 2.2.7, sync writes from a VM backed by a zvol are no longer honored as sync writes by ZFS. a bisect identified the commit introducing the regression (see below).
Description
since 4.10, bio->bi_opf needs to be checked to determine all kinds of flush requests. this was the case prior to the commit referenced below, but the order of ifdefs was not the usual one (newest up top), which might have caused this to slip through.
this fixes a regression when using zvols as Qemu block devices, but might have broken other use cases as well. the symptoms are that all sync writes from within a VM configured to use such a virtual block device are ignored and treated as async writes by the host ZFS layer.
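To make the failure mode concrete: the pre-regression compat code checked either the op or the flag (see the review discussion above), while the regressed code checked only the op, so a write bio carrying REQ_PREFLUSH in bi_opf was no longer recognized as a flush. The sketch below illustrates the broader check; it is based on that discussion, not necessarily the exact code that was merged, and bio_carries_flush is a hypothetical name.

```c
#include <linux/blk_types.h>	/* struct bio, bio_op(), REQ_OP_FLUSH, REQ_PREFLUSH */

/* Illustrative sketch only; bio_carries_flush is a hypothetical helper name. */
static inline bool
bio_carries_flush(struct bio *bio)
{
	/* catch both a dedicated flush op and a write that carries a preflush flag */
	return (bio_op(bio) == REQ_OP_FLUSH ||
	    (bio->bi_opf & REQ_PREFLUSH) != 0);
}
```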
this can be verified using fio in sync mode inside the VM, for example with
```
fio --filename=/dev/sda --ioengine=libaio --loops=1 --size=10G \
    --time_based --runtime=60 --group_reporting --stonewall --name=cc1 \
    --description="CC1" --rw=write --bs=4k --direct=1 --iodepth=1 \
    --numjobs=1 --sync=1
```
which shows an IOPS number way above what the physical device underneath supports, with "zpool iostat -r 1" on the hypervisor side showing no sync IO occurring during the benchmark.
with the regression fixed, both fio inside the VM and the IO stats on the host show the expected numbers.
Fixes: 846b598
How Has This Been Tested?
see above - configured a test VM with a disk backed by a zvol, bisected using the fio command line above, and verified the fix in the same fashion after identifying the culprit.
Types of changes
Checklist:
Signed-off-by