Skip to content

openhcl/tdx: inject machine check on ept exits caused by bad guest addresses #1340

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

chris-oo
Copy link
Member

The guest should never touch addresses that are not described to it in the memory layout. Previously, we emulated this exit but we should instead be injecting a machine check.

Additionally, a guest accessing a page with the incorrect shared bit set also results in a machine check. This otherwise will result in an emulation failure later, causing an OpenHCL crash.

PENDING: tests

@chris-oo chris-oo requested a review from a team as a code owner May 12, 2025 19:05
// For now, we just check if the exit was suprious or if we
// should inject a machine check. An exit is considered spurious
// if the gpa is accessible.
if self.partition.gm[intercepted_vtl].check_gpa_readable(gpa) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to copy these checks over to SNP? Or put them in some shared location in hardware_cvm?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so yes - let me see if i can do a quick attempt. But SNP also needs to be fixed to not rely on the hypervisor supplied intercept message.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixing SNP requires digging thru the specs. I'd like to take the TDX change on the first try so we can get some wider coverage.

?ept_info,
"guest accessed gpa not described in memory layout, injecting MC"
);
self.inject_mc(intercepted_vtl);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I worry this is too aggressive--it's not necessarily the case that all MMIO resides inside the MMIO ranges that are listed in the topology. E.g., I don't see how XAPIC works with this approach.

So I think either you need to weaken this check, or we need to do more work to ensure the topology is completely accurate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm... we don't support XAPIC today in TDX today, right, only x2apic?

I think you are right though that this check is very strict. Note that this is what our other paravisor implementation does.

For non CVMs I am a bit worried about this being overzealous (and we probably don't need to be so strict). I bet we'd need to make this handling a bit more configurable - let me think about it some more.

)
.await?;
}
Some(AddressType::Ram) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to handle overlay pages as well. Specifically, the MNF overlay will be in RAM but needs to be handled via emulation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't support MNF today right? the intention about this being limited was the TODO comment above about VTL pages and MNF

.lower_vtl_memory_layout
.probe_address(canonical_gpa);

match address_type {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably this logic should ultimately move into the virt_support_x86emu crate so that we don't have to replicate it everywhere. I.e., we should probably do this same thing for standard VMs with openvmm, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. We discussed a bit offline, but I think we'll need to make the policy configurable. I'll take a stab.

@chris-oo chris-oo marked this pull request as draft May 15, 2025 18:23
@chris-oo chris-oo force-pushed the inject-mc-tdx branch 2 times, most recently from 925a86b to 57c5608 Compare May 28, 2025 15:27
@chris-oo chris-oo added the backport_2505 Change should be backported to the release/2505 branch label May 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport_2505 Change should be backported to the release/2505 branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants