[SH] add userfault support #5261
Open: kalyazin wants to merge 19 commits into firecracker-microvm:feature/secret-hiding from kalyazin:sh_uf
Codecov Report
Coverage diff (columns: feature/secret-hiding, #5261, +/-):
Coverage   82.52%   81.66%   -0.86%
Files         250      250
Lines       27386    27795    +409
Hits        22599    22698     +99
Misses       4787     5097    +310
This is needed because if guest_memfd is used to back guest memory, vCPU fault notifications are delivered via the UFFD UDS socket. Signed-off-by: Nikita Kalyazin <[email protected]>
Example UFFD handlers now read from the UDS socket in a buffered way. This lets future commits read messages of different types, so that the handlers can process fault request messages from Firecracker if Secret Freedom is enabled. Signed-off-by: Nikita Kalyazin <[email protected]>
It is used by Secret-Free-enabled UFFD handlers to disable vCPU fault notifications from the kernel. Signed-off-by: Nikita Kalyazin <[email protected]>
Accept receiving 3 fds instead of 1, where fds[1] is guest_memfd and fds[2] is userfault bitmap memfd. Also handle the FaultRequest message over the UDS socket by calling a new callback in the Runtime and sending a FaultReply. TODO: add cab/sob from Patrick Signed-off-by: Nikita Kalyazin <[email protected]>
There are two ways a UFFD handler receives a fault notification if Secret Freedom is enabled (which is inferred from 3 fds sent by Firecracker instead of 1):
- a VMM- or KVM-triggered fault is delivered via a minor UFFD fault event. The handler is supposed to respond to it by memcpying the content of the page (if the page hasn't already been populated), followed by a UFFDIO_CONTINUE call.
- a vCPU-triggered fault is delivered via a FaultRequest message on the UDS socket. The handler is supposed to reply with a pwrite64 call on the guest_memfd to populate the page, followed by a FaultReply message on the UDS socket.
In both cases, the handler also needs to clear the bit in the userfault bitmap at the corresponding offset in order to stop further fault notifications for the same page.
UFFD handlers use the userfault bitmap for two purposes:
- communicate to the kernel whether a fault at the corresponding guest_memfd offset will cause a VM exit
- keep track of pages that have already been populated in order to avoid overwriting the content of a page that is already initialised.
Signed-off-by: Nikita Kalyazin <[email protected]>
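The double role of the userfault bitmap described above (gating exits and tracking populated pages) can be sketched as follows. This is a minimal illustration only, not the handler code from this PR: `UserfaultBitmap`, `handle_fault`, the one-bit-per-page layout with bit `n` of word `n / 64`, and the in-process `Vec<AtomicU64>` standing in for the shared memfd mapping are all assumptions made for the example.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for the userfault bitmap memfd mapping.
/// Assumption: one bit per guest page; a set bit means a vCPU fault
/// on that page still has to be reported to userspace.
pub struct UserfaultBitmap {
    words: Vec<AtomicU64>,
    page_size: u64,
}

impl UserfaultBitmap {
    /// Build a bitmap covering `mem_size` bytes with every bit set,
    /// i.e. notifications requested for all pages.
    pub fn new_all_set(mem_size: u64, page_size: u64) -> Self {
        let pages = (mem_size + page_size - 1) / page_size;
        let num_words = ((pages + 63) / 64) as usize;
        let words = (0..num_words).map(|_| AtomicU64::new(u64::MAX)).collect();
        Self { words, page_size }
    }

    /// Map a guest_memfd offset to (word index, bit mask).
    fn locate(&self, offset: u64) -> (usize, u64) {
        let page = offset / self.page_size;
        ((page / 64) as usize, 1u64 << (page % 64))
    }

    /// True if the page at `offset` has not been populated yet.
    pub fn is_set(&self, offset: u64) -> bool {
        let (word, mask) = self.locate(offset);
        (self.words[word].load(Ordering::Acquire) & mask) != 0
    }

    /// Atomically clear the bit; returns true if this call cleared it.
    pub fn clear(&self, offset: u64) -> bool {
        let (word, mask) = self.locate(offset);
        (self.words[word].fetch_and(!mask, Ordering::AcqRel) & mask) != 0
    }
}

/// Populate a page at most once, then stop further notifications for it.
/// `populate` stands in for memcpy + UFFDIO_CONTINUE or pwrite64.
/// Returns true if the page was populated by this call.
pub fn handle_fault(
    bitmap: &UserfaultBitmap,
    offset: u64,
    populate: impl FnOnce(u64),
) -> bool {
    if !bitmap.is_set(offset) {
        // Already initialised: do not overwrite the page content.
        return false;
    }
    populate(offset);
    bitmap.clear(offset)
}
```

In this sketch a second fault on the same offset skips the `populate` callback, which mirrors the "avoid overwriting" purpose named in the commit message.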
These are used for communication of page faults between Firecracker and a UFFD handler. Signed-off-by: Nikita Kalyazin <[email protected]>
If configured, userfault bitmap is registered with KVM and controls whether KVM will exit to userspace on a fault of the corresponding page. We are going to allocate the bitmap in a memfd in Firecracker, set bits for all pages to request notifications for vCPU faults and send it to the UFFD handler to delegate clearing the bits as pages get populated. Since the KVM userfault patches are still in review, set_user_memory_region2 is not aware of the userfault flag and the userfault bitmap address in its input structure. Define it in Firecracker code temporarily. Signed-off-by: Nikita Kalyazin <[email protected]>
This is needed to instruct the kernel to exit to userspace when a vCPU fault occurs and the corresponding bit in the userfault bitmap is set. The userfault bitmap is allocated in a memfd by Firecracker and sent to the UFFD handler. This also sends 3 fds to the UFFD handler in the handshake:
- UFFD (original)
- guest_memfd: for the handler to be able to populate guest memory
- userfault bitmap memfd: for the handler to be able to disable exits to userspace for the pages that have already been populated
Signed-off-by: Nikita Kalyazin <[email protected]>
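On the handler side, the fd count in the handshake is what distinguishes a regular VM from a Secret Free one. A minimal sketch of that classification follows; the `HandshakeFds` enum and `classify_fds` function are hypothetical names for illustration, and only the fd order (UFFD, guest_memfd, userfault bitmap memfd) is taken from the commit messages.

```rust
use std::os::unix::io::RawFd;

/// Hypothetical view of the fds received in the handshake.
#[derive(Debug, PartialEq, Eq)]
pub enum HandshakeFds {
    /// Regular VM: only the UFFD is sent.
    Regular { uffd: RawFd },
    /// Secret Free VM: UFFD, guest_memfd and userfault bitmap memfd.
    SecretFree {
        uffd: RawFd,
        guest_memfd: RawFd,
        userfault_bitmap_memfd: RawFd,
    },
}

/// Infer the mode from the number of fds (1 or 3); anything else is an error.
pub fn classify_fds(fds: &[RawFd]) -> Result<HandshakeFds, String> {
    match fds {
        [uffd] => Ok(HandshakeFds::Regular { uffd: *uffd }),
        [uffd, guest_memfd, bitmap] => Ok(HandshakeFds::SecretFree {
            uffd: *uffd,
            guest_memfd: *guest_memfd,
            userfault_bitmap_memfd: *bitmap,
        }),
        _ => Err(format!("expected 1 or 3 fds, got {}", fds.len())),
    }
}
```

Rejecting any other count explicitly, rather than indexing into the slice, keeps a malformed handshake from being misread as one of the two modes.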
These will be used to communicate vCPU faults between vCPUs and the VM if secret freedom is enabled. Signed-off-by: Nikita Kalyazin <[email protected]>
This is because vCPUs reason in GPAs while the secret-free UFFD protocol is guest_memfd-offset-based. TODO: add cab/sob from Patrick Signed-off-by: Nikita Kalyazin <[email protected]>
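The translation between GPAs and guest_memfd offsets could look roughly like the sketch below. The `Region` struct, its field names, and both functions are assumptions made for this example and are not taken from the PR; the sketch only assumes that each guest memory region maps a contiguous GPA range onto a contiguous guest_memfd range.

```rust
/// Hypothetical description of one guest memory region.
#[derive(Debug, Clone, Copy)]
pub struct Region {
    /// First guest physical address of the region.
    pub gpa_start: u64,
    /// Region size in bytes.
    pub size: u64,
    /// Offset of the region inside the guest_memfd.
    pub memfd_offset: u64,
}

/// Translate a GPA (what vCPUs report) into a guest_memfd offset
/// (what the secret-free UFFD protocol uses). None if unmapped.
pub fn gpa_to_offset(regions: &[Region], gpa: u64) -> Option<u64> {
    regions.iter().find_map(|r| {
        let delta = gpa.checked_sub(r.gpa_start)?;
        (delta < r.size).then(|| r.memfd_offset + delta)
    })
}

/// Inverse translation, e.g. for routing a reply back to a faulting vCPU.
pub fn offset_to_gpa(regions: &[Region], offset: u64) -> Option<u64> {
    regions.iter().find_map(|r| {
        let delta = offset.checked_sub(r.memfd_offset)?;
        (delta < r.size).then(|| r.gpa_start + delta)
    })
}
```

With two regions separated by a hole in the guest physical address space, a GPA inside the hole yields `None`, while GPAs in the second region land directly after the first region in the guest_memfd.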
It contains two parts:
- external: between the VMM thread and the UFFD handler
- internal: between vCPUs and the VMM thread
An outline of the workflow:
- When a vCPU fault occurs, the vCPU exits to userspace
- The vCPU thread sends a message to the VMM thread via the userfault channel
- The VMM thread forwards the message to the UFFD handler via the UDS socket
- The UFFD handler populates the page, clears the corresponding bit in the userfault bitmap and sends a reply to Firecracker
- The VMM thread receives the reply and forwards it to the vCPU via the userfault channel
- The vCPU resumes execution
Signed-off-by: Nikita Kalyazin <[email protected]>
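The workflow outlined above can be modelled with plain threads and channels. The sketch below is a simulation under stated assumptions, not Firecracker code: the `FaultRequest` and `FaultReply` fields are hypothetical (the source does not list them), and `std::sync::mpsc` channels stand in for both the internal userfault channel and the external UDS socket.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical message shapes; the real fields are not given in the source.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FaultRequest {
    pub vcpu: u32,
    pub offset: u64,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FaultReply {
    pub vcpu: u32,
    pub offset: u64,
}

/// Simulate one vCPU fault travelling vCPU -> VMM -> handler and back.
pub fn simulate_fault(vcpu: u32, offset: u64) -> FaultReply {
    // Internal part: userfault channel between vCPU and VMM thread.
    let (vcpu_tx, vmm_rx) = mpsc::channel::<FaultRequest>();
    let (vmm_tx, vcpu_rx) = mpsc::channel::<FaultReply>();
    // External part: these channels stand in for the UDS socket.
    let (uds_req_tx, uds_req_rx) = mpsc::channel::<FaultRequest>();
    let (uds_rep_tx, uds_rep_rx) = mpsc::channel::<FaultReply>();

    // UFFD handler: would populate the page and clear the bitmap bit here.
    let handler = thread::spawn(move || {
        for req in uds_req_rx {
            let reply = FaultReply { vcpu: req.vcpu, offset: req.offset };
            uds_rep_tx.send(reply).unwrap();
        }
    });

    // VMM thread: forwards requests out and replies back in.
    let vmm = thread::spawn(move || {
        for req in vmm_rx {
            uds_req_tx.send(req).unwrap();
            let reply = uds_rep_rx.recv().unwrap();
            vmm_tx.send(reply).unwrap();
        }
    });

    // vCPU side: report the fault, then block until it has been handled.
    vcpu_tx.send(FaultRequest { vcpu, offset }).unwrap();
    let reply = vcpu_rx.recv().unwrap();

    // Closing the request channel lets both helper threads exit.
    drop(vcpu_tx);
    vmm.join().unwrap();
    handler.join().unwrap();
    reply
}
```

Calling `simulate_fault(0, 0x3000)` returns once the simulated handler has acknowledged the fault, which corresponds to the point where the vCPU would resume execution.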
This is required by Secret Freedom to implement the userfault protocol: vCPUs read notification of fault handling completions from the userfault channel. Signed-off-by: Nikita Kalyazin <[email protected]>
kvmclock is currently not supported by Secret Freedom and calling kvmclock_ctrl will always fail. Signed-off-by: Nikita Kalyazin <[email protected]>
In a regular VM, we mmap the memory snapshot file and supply the address in the KVM memory slot. In Secret Free VMs, we provide guest_memfd in the memory slot instead. There is no way we can restore a Secret Free VM from a file, unless we prepopulate the guest_memfd with the file content, which is inefficient and is not practically useful. Signed-off-by: Nikita Kalyazin <[email protected]>
It is not supported by Secret Freedom. Signed-off-by: Nikita Kalyazin <[email protected]>
This includes both functional and performance tests. Signed-off-by: Nikita Kalyazin <[email protected]>
Do not add a balloon device to a Secret Free VM as it is not currently supported. Signed-off-by: Nikita Kalyazin <[email protected]>
When taking a snapshot from a Secret Free VM, we create a bounce buffer to be able to pass it to the host kernel to store in a file. Exclude it from the memory monitor calculation. Signed-off-by: Nikita Kalyazin <[email protected]>
This is because the error type has changed due to the implementation of snapshot restore support for Secret Free VMs. Signed-off-by: Nikita Kalyazin <[email protected]>
JackThomson2 approved these changes Jun 19, 2025
Changes
Implement userfault support in Secret Freedom. The goal of this change is to be able to resume Secret-Free VMs via UFFD.
Major changes:
writes to the guest_memfd to populate guest pages and clears bits in the userfault bitmap (memfd) to stop KVM from sending vCPU fault notifications
Reason
This is needed to be able to restore snapshots where the VM was backed by guest_memfd.
License Acceptance
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check CONTRIBUTING.md.
PR Checklist
tools/devtool checkstyle to verify that the PR passes the automated style checks.
how they are solving the problem in a clear and encompassing way.
[ ] I have updated any relevant documentation (both in code and in the docs) in the PR.
[ ] I have mentioned all user-facing changes in CHANGELOG.md.
[ ] If a specific issue led to this PR, this PR closes the issue.
[ ] When making API changes, I have followed the Runbook for Firecracker API changes.
[ ] I have tested all new and changed functionalities in unit tests and/or integration tests.
[ ] I have linked an issue to every new TODO.
rust-vmm.