Meta: create "loss of data" label #295

Open
Thesola10 opened this issue Nov 7, 2024 · 1 comment

Comments

@Thesola10

I am currently managing a NixOS NAS at home and am looking to gauge the risk factor for bees. So far I haven't had data loss using it on my laptop, but while I do have backups, I can't afford to risk server data for space savings.

In the interest of trust and accountability, would it be possible to create a dedicated GitHub bug tracking label for data loss incidents related to bees? I believe this is warranted for a low-level filesystem tool.

Even if it ends up unused -- in fact, I sincerely wish for this label to go unused :p

@Zygo
Owner

Zygo commented Nov 7, 2024

Sure, if it ever comes up. As far as I know, bees has never been the root cause of any data loss event. Do you have one to report?

btrfs has occasionally had kernel releases with data-losing bugs. Running any software which modifies the filesystem on such a kernel can cause data loss. Some of these bugs are documented on the kernel bugs table but they can affect many applications, not just bees. For example, one bug on the bugs list carries a small but non-zero risk of total filesystem data loss for every write operation involving btrfs--it's the one with the big data corruption warnings at the top of the page.

It would be difficult to maintain the accuracy and integrity of such a label on bees GitHub issues. Even when a kernel bug is triggered by one of the fixed set of operations that bees performs (tree search, extent backref lookup, inode name lookup, file open, file stat, data read, data write, data write with compression, and deduplicate), those operations are fundamental to what bees does. The data loss risk assessment would apply only to the combination of bees with specific kernel versions, and no change in bees would add or remove the data loss risk from the combination of bees with a bad kernel version. Only changes to the kernel, not bees, can fix a kernel bug.

I don't think it's feasible or reasonable for the bees project to take on the responsibility of tracking all data-losing kernel bugs in btrfs--and especially not if the scope is expanded to cover related subsystems like SATA disks or LVM, which btrfs may depend on for data integrity. I make a nominal effort to test all mainline kernel releases with current bees versions (mostly to protect data directly in my care), and I update the published pages when I find new issues. I'm willing to republish bugs others have identified if I can confirm the issue. I'm never going to be a replacement for proper and timely kernel QA.

bees has some low-level knowledge of btrfs filesystem structure, but it only ever reads these structures. It is up to the kernel to perform all modifications recommended by bees--and the kernel can (and often does) reject these recommendations if they would alter data. There are a number of risk mitigation design features as well:

  • user data files are always opened O_RDONLY so that a bug causing a write to a wrong FD in the bees process will fail instead of overwriting user data
  • file stat is verified after opening to ensure the intended file was opened before performing any further operations
  • use of O_NOFOLLOW and O_TMPFILE to further reduce symlink and TOCTTOU attack surface (bees is a bit behind here: since kernel 5.6 there is openat2, which has stronger cross-device and symlink controls)
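
To make the first three mitigations concrete, here is a minimal Python sketch of the general pattern on Linux: open read-only with O_NOFOLLOW, then check the opened descriptor with fstat before doing anything else. This is not bees code (bees is written in C++), and the helper name `open_verified_readonly`, the `(st_dev, st_ino)` identity check, and the ESTALE error are assumptions chosen for illustration only.

```python
import errno
import os
import tempfile


def open_verified_readonly(path, expected_dev, expected_ino):
    """Hypothetical helper: open read-only, refuse a final symlink, verify identity."""
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW | os.O_CLOEXEC)
    try:
        # fstat the descriptor itself, so the check cannot race with a rename.
        st = os.fstat(fd)
        if (st.st_dev, st.st_ino) != (expected_dev, expected_ino):
            raise OSError(errno.ESTALE, "opened file is not the intended one", path)
    except BaseException:
        os.close(fd)
        raise
    return fd


def demo():
    """Return (write_blocked, symlink_blocked, data_intact) for a scratch file."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "data.bin")
        with open(target, "wb") as f:
            f.write(b"user data")
        want = os.stat(target)

        fd = open_verified_readonly(target, want.st_dev, want.st_ino)
        try:
            try:
                # A stray write to a read-only descriptor fails with EBADF.
                os.write(fd, b"oops")
                write_blocked = False
            except OSError as e:
                write_blocked = e.errno == errno.EBADF
        finally:
            os.close(fd)

        link = os.path.join(tmp, "link")
        os.symlink(target, link)
        try:
            # On Linux, O_NOFOLLOW makes opening a symlink fail with ELOOP.
            os.close(os.open(link, os.O_RDONLY | os.O_NOFOLLOW))
            symlink_blocked = False
        except OSError as e:
            symlink_blocked = e.errno == errno.ELOOP

        with open(target, "rb") as f:
            data_intact = f.read() == b"user data"
        return write_blocked, symlink_blocked, data_intact
```

Calling `demo()` exercises each check on a temporary file: the misdirected write is rejected by the kernel, the symlink open is refused, and the file contents are unchanged afterwards. The point of the pattern is that the kernel, not the application, enforces the read-only guarantee.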
