Do "nova error" messages indicate a bug in NOVA? #110
Comments
That does sound like a bug. NOVA tries to free some blocks which are already marked as free. Tracking the allocation/free of the ranges should help to locate the bug.
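As a rough illustration of what tracking the allocation/free of ranges could look like, here is a minimal userspace sketch. It is not NOVA code: `range_tracker`, `tracker_alloc`, `tracker_free`, and the `TRACK_BLOCKS` size are all hypothetical names made up for this example. The idea is only to shadow every allocation and free in a side table and report the first caller that frees a block already marked free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define TRACK_BLOCKS 1024UL /* hypothetical number of blocks under observation */

/* Shadow state, one byte per block: 1 = allocated, 0 = free. */
struct range_tracker {
    unsigned char in_use[TRACK_BLOCKS];
};

static void tracker_init(struct range_tracker *t)
{
    memset(t->in_use, 0, sizeof(t->in_use));
}

/* A range is valid if it is non-empty and lies inside the tracked region. */
static bool range_valid(unsigned long start, unsigned long num)
{
    return num > 0 && start < TRACK_BLOCKS && num <= TRACK_BLOCKS - start;
}

/*
 * Flip [start, start + num) to allocated or free. Every block must be in
 * the opposite state first; otherwise log the offending block and the
 * caller tag, leave the table unchanged, and return false.
 */
static bool tracker_mark(struct range_tracker *t, unsigned long start,
                         unsigned long num, bool allocate, const char *who)
{
    unsigned char expected = allocate ? 0 : 1;

    assert(t != NULL);
    if (!range_valid(start, num)) {
        fprintf(stderr, "%s: bad range start=%lu num=%lu\n", who, start, num);
        return false;
    }
    for (unsigned long i = start; i < start + num; i++) {
        if (t->in_use[i] != expected) {
            fprintf(stderr, "%s: block %lu is already %s\n", who, i,
                    allocate ? "allocated" : "free");
            return false;
        }
    }
    memset(&t->in_use[start], allocate ? 1 : 0, num);
    return true;
}

static bool tracker_alloc(struct range_tracker *t, unsigned long start,
                          unsigned long num, const char *who)
{
    return tracker_mark(t, start, num, true, who);
}

static bool tracker_free(struct range_tracker *t, unsigned long start,
                         unsigned long num, const char *who)
{
    return tracker_mark(t, start, num, false, who);
}
```

Calling `tracker_free` twice on the same range makes the second call return false and print the caller tag, which narrows down which code path issued the redundant free.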
I just got around to taking a look at this issue. If my understanding of the free list management in NOVA's recovery is correct, I think the problem looks something like this at a high level:
At this point, the primary inode will claim that foo is using some blocks that are also in the free list. I have a workaround, but I don't think it's a great fix - I think it's slow and does more work than necessary. In |
Actually, there is no need to explicitly call nova_iget and iput(). Instead, before recovering the inode pages, we should first verify the checksum and recover the inode if metadata_csum is enabled. There is a FIXME at line 1331 of bbuild.c, which should be the place to add the check.
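To make the ordering concrete, here is a small userspace sketch of "verify the checksum and repair the inode first, then go on to recovery". Everything in it is an assumption for illustration: `toy_inode`, `toy_csum` (an FNV-1a hash standing in for whatever checksum NOVA actually uses), and `check_and_repair` are invented names, and the primary/replica pairing is modeled in the simplest possible way. It is not the code that belongs at the FIXME.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an on-media inode; field names are hypothetical. */
struct toy_inode {
    uint64_t size;
    uint64_t blocks;
    uint32_t csum; /* covers size and blocks */
};

/* FNV-1a over the non-checksum fields (placeholder, not NOVA's checksum). */
static uint32_t toy_csum(const struct toy_inode *in)
{
    const uint64_t fields[2] = { in->size, in->blocks };
    const unsigned char *p = (const unsigned char *)fields;
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < sizeof(fields); i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Recompute and store the checksum after a complete update. */
static void toy_inode_seal(struct toy_inode *in)
{
    in->csum = toy_csum(in);
}

static bool toy_inode_ok(const struct toy_inode *in)
{
    return in->csum == toy_csum(in);
}

enum repair_result { INODE_OK, INODE_REPAIRED, INODE_UNRECOVERABLE };

/*
 * Run this before any recovery step that trusts the inode's fields:
 * if exactly one copy fails its checksum, overwrite it with the good one.
 */
static enum repair_result check_and_repair(struct toy_inode *primary,
                                           struct toy_inode *replica)
{
    bool p_ok = toy_inode_ok(primary);
    bool r_ok = toy_inode_ok(replica);

    if (p_ok && r_ok)
        return INODE_OK;
    if (p_ok) {
        *replica = *primary;
        return INODE_REPAIRED;
    }
    if (r_ok) {
        *primary = *replica;
        return INODE_REPAIRED;
    }
    return INODE_UNRECOVERABLE;
}
```

In this sketch, a recovery routine would call `check_and_repair` first and only walk the inode's pages if the result is not `INODE_UNRECOVERABLE`, so that the free list is never rebuilt from a torn primary copy.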
That makes sense - my workaround felt pretty janky :) Unfortunately I'm having trouble getting that solution to work. It seems as though running the checksum verification and inode repair in |
Hi,
I'll go into some details below, but my main question is: is it ever expected that NOVA will produce an error via nova_err during normal execution/after a crash and successful recovery, or is such an error indicative of a bug? In experimenting with crash-testing NOVA I've been seeing some of these errors, but the operations producing those errors do seem to complete successfully.
Specifically, I'm running a program that performs the following operations in NOVA with metadata_csum turned on:
I'm injecting crashes in the truncate and looking at various possible crash states that can arise. truncate calls nova_notify_change, which performs some updates on the affected inode and then updates the inode's checksum. I've found that when I inject a crash after the non-checksum fields have begun to be updated but before the checksum is updated, and I mount the resulting crash state, deleting A/foo produces the following logs but otherwise succeeds.
I noticed this because on the version of Ubuntu I'm running NOVA with, the default log levels show the nova error message to the user on the terminal. Deleting A/foo DOES succeed (returns a success code and the file is actually deleted) but the error message being displayed to the user seems undesirable.
Thanks!
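For readers who want to picture the crash window described above, here is a minimal userspace model of it. It is only a sketch under stated assumptions: `toy_inode`, `toy_csum`, `toy_truncate`, and the `crash_point` enum are hypothetical names, the checksum is a placeholder FNV-1a hash, and a "crash" is simulated by returning early. It does not reproduce nova_notify_change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy inode with one mutable field plus a checksum over all other fields. */
struct toy_inode {
    uint64_t size;
    uint64_t mode;
    uint32_t csum;
};

/* Placeholder checksum (FNV-1a); the real checksum is not modeled here. */
static uint32_t toy_csum(const struct toy_inode *in)
{
    const uint64_t fields[2] = { in->size, in->mode };
    const unsigned char *p = (const unsigned char *)fields;
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < sizeof(fields); i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

static bool toy_inode_ok(const struct toy_inode *in)
{
    return in->csum == toy_csum(in);
}

enum crash_point { NO_CRASH, CRASH_BEFORE_CSUM };

/*
 * Two-step update: write the new size, then refresh the checksum.
 * CRASH_BEFORE_CSUM stops between the two steps, leaving a new size
 * paired with the old checksum.
 */
static void toy_truncate(struct toy_inode *in, uint64_t new_size,
                         enum crash_point cp)
{
    in->size = new_size;
    if (cp == CRASH_BEFORE_CSUM)
        return; /* simulated crash: checksum is never rewritten */
    in->csum = toy_csum(in);
}
```

After `toy_truncate(&in, 0, CRASH_BEFORE_CSUM)`, `toy_inode_ok(&in)` returns false, which is the torn state a later operation such as the delete would run into if nothing repairs the inode first.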