-
Notifications
You must be signed in to change notification settings - Fork 165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement very simple advisory lock mechanism #664
base: master
Are you sure you want to change the base?
Conversation
e312871
to
531850c
Compare
glommio/src/io/glommio_file.rs
Outdated
"Not holding the lock!" | ||
); | ||
|
||
let source = self |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing I've noticed here is that the background thread dispatch is quite expensive (the syscall takes microseconds but the round-trip to the background thread takes >10ms on my 3.9 GhZ CPU & the background thread should be idle). While that's fine for acquiring the lock (can argue the latency is just the cost of acquisition), it's a bit unfortunate for unlock - not sure if I should do the funlock explicitly since try_take_last_clone_unlocking_guard
is much slower than drop + try_take_last_clone (except that pattern isn't quite as safe if you're trying to enforce a clean shutdown). I could be convinced that unlock should be a synchronous call - @glommer thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the main cpu busy? My suspicion is that you have a too high poll period, and if the main thread is busy, it will not know that it is completed until poll happens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not very familiar with advisory locks in practice. Never used them, so it's hard for me to talk about the caveats. But it seems to me that:
- you have state in userspace that tracks whether the lock is held (which is great, btw)
- you will only try to acquire the lock in the kernel when you already know you will succeed.
If that's the case, maybe we can just call the blocking call? Blocking for microseconds is absolutely not a problem, especially if this is a lock (not called thousands a time per second). You'd want this to happen on the blocking thread if this is an operation that can take dozens of milliseconds, potentially.
Is there any case where we are concerning that locking (or unlocking) could take more than 1ms ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Benchmarking it on my machine (which isn't the best but it's the tool I have without resorting to reading the Linux kernel source), it's ~220ns for lock + unlock call (lock & unlock seem equally expensive).
I originally had the blocking version also in this PR but that approach doesn't work because we can easily deadlock since we run out background threads to distribute work to. Probably no way to implement those without proper I/O uring support (or doing something silly like doing a try lock & then spawning a completely new thread just for waiting if it fails).
So yeah, I'll move these back onto the main thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, just want to confirm. While it's fine for local files, this may show problems for NFS/SMB shares of which I don't have any to really test since those will map to network I/O. Handle it via documentation or stick to the block approach just in case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could use fstatfs to determine if it's NFS/SMB & fall back to the blocking path in that case if we want "optimal" behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to assume NFS/SMB is an unlikely use case for DmaFile for now so leaving it out. Can always add a fstatfs call later to distinguish network FS from local at the cost of an extra syscall in the advisory lock path.
877bc4f
to
83261bf
Compare
Is this ready to go ? Looks good on a new review. In related news, I am considering removing the clippy tests from CI. I still love the idea, but we struggled a lot already trying to figure out a good way to pin versions to make sure it stays stable, yet it keeps breaking. Do you have any thoughts ? |
I thought it was but need to fix failing tests. I actually don't mind the clippy stuff in CI. Keeps the codebase consistent which outweighs the nitpickiness annoyance of it for me personally. Probably could do with a pre push hook that runs the required clippy lints and doctests before so that I don't churn so much on it though :) If we're running clippy on a floating version (eg nightly or latest stable) then we could just not do that and stick to running clippy only on pinned versions at the risk of having more work when bumping the minimum / maximum rust version we build against (+ having downstream projects get more complaints if they're running newer than the version we test against). All options come with trade offs 😃 |
yeah, the pre-hooks may work. I'll take a look at it later this week. |
BTW the failure here was just a normal compile error :). |
Blocking versions can't be implemented because they would require spawning a new thread to avoid jamming up the background thread used for other syscalls (the jamming up can generate a deadlock which is what testing found). The blocking version would require spawning a new thread with a standalone executor but that's horibbly expensive AND runs into DataDog#448.
What does this PR do?
Introduces very basic advisory lock integration. As I mentioned in the issue, going for a very simple non-fancy approach that only lets you acquire a single advisory lock per file entry.
Motivation
Implementing a database & would like to use a safe version of advisory locks to implement mutual exclusion for writers. It's not possible to implement advisory locking in pure safe code outside because the resulting usage becomes unsound very easily (the Rust advisory-lock crate pretends it's safe but I'd argue that crate is unsound because it doesn't guarantee the usage of the lock is correct within the context of the current process).
Related issues
#663
Additional Notes
#448 prevents us from implementing "blocking" versions that wait forever for the lock to become available. The reason is that it would require spawning a new thread which would leak memory (it would also be horribly slow & interact poorly with cancellation).
Checklist
[X] I have added unit tests to the code I am submitting
[X] My unit tests cover both failure and success scenarios
[] If applicable, I have discussed my architecture