Re: [RFC v2-fix 1/1] x86/tdx: Handle in-kernel MMIO

From: Dave Hansen
Date: Tue May 18 2021 - 12:08:37 EST


On 5/18/21 8:56 AM, Andi Kleen wrote:
> On 5/18/2021 8:00 AM, Dave Hansen wrote:
>> On 5/17/21 5:48 PM, Kuppuswamy Sathyanarayanan wrote:
>>> From: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
>>>
>>> In traditional VMs, MMIO tends to be implemented by giving a
>>> guest access to a mapping which will cause a VMEXIT on access.
>>> That's not possible in a TDX guest.
>> Why is it not possible?
>
> For one thing, the TDX module doesn't support uncached mappings
> (IgnorePAT is always 1)

Actually, I was thinking more along the lines of why the architecture
doesn't have VMEXITs: VMEXITs expose guest state to the host and VMMs
use that state to emulate MMIO. TDX guests don't trust the host and
can't have that arbitrary state exposed to the host. So, they sanitize
the state in the #VE handler and make a *controlled* transition into the
host with a TDCALL rather than an uncontrolled VMEXIT.
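
To make the "controlled transition" point concrete, here is a rough
user-space sketch of the idea, not the actual TDX guest code. All names
in it (struct guest_regs, struct mmio_request, build_mmio_request()) are
made up for illustration. It only models the filtering step: the handler
sees the full register state, but the request handed to the host carries
nothing except the address, size, direction and (for writes) the
truncated value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: full register state visible to the guest's #VE handler. */
struct guest_regs {
	uint64_t gpr[16];	/* general-purpose registers, may hold secrets */
	uint64_t rip;
	uint64_t rflags;
};

/* Hypothetical: the only fields the host needs to emulate one MMIO access. */
struct mmio_request {
	uint64_t gpa;		/* guest physical address of the access */
	uint64_t value;		/* data for writes, zero for reads */
	uint8_t size;		/* access width in bytes: 1, 2, 4 or 8 */
	bool is_write;
};

/*
 * Build the request that would be passed to the host. Nothing from
 * regs is copied except the low 'size' bytes of the source register,
 * and only when the access is a write.
 */
struct mmio_request build_mmio_request(const struct guest_regs *regs,
				       unsigned int src_reg, uint64_t gpa,
				       uint8_t size, bool is_write)
{
	struct mmio_request req;

	assert(size == 1 || size == 2 || size == 4 || size == 8);
	assert(src_reg < 16);

	/* Start from zero so no stale stack contents leak out. */
	memset(&req, 0, sizeof(req));
	req.gpa = gpa;
	req.size = size;
	req.is_write = is_write;

	if (is_write) {
		uint64_t mask = (size == 8) ? UINT64_MAX
					    : ((UINT64_C(1) << (8 * size)) - 1);
		req.value = regs->gpr[src_reg] & mask;
	}
	return req;
}
```

In a real guest the resulting request would be handed to the host via
TDCALL; that step is deliberately left out here.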

>>> For now we only handle a subset of instructions that the kernel
>>> uses for MMIO operations. User-space access triggers SIGBUS.
>> How do you know which instructions the kernel uses?
>
> They're all in MMIO macros.

I've heard exactly the opposite from the TDX team in the past. What I
remember was a claim that one cannot just leverage the MMIO macros as a
single point to avoid MMIO. I remember being told that not all code in
the kernel that does MMIO uses these macros. APIC MMIO accesses were
called out as a place that does not use the MMIO macros.

I'm confused now.

>> How do you know that the compiler won't change them?
>
> The macros try hard to prevent that because it would likely break real
> MMIO too.
>
> Besides it works for others, like AMD-SEV today and of course all the
> hypervisors that do the same.

That would be some excellent information for the changelog.
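
For the changelog, it may help to spell out how an accessor can pin the
instruction. The sketch below is a user-space approximation, assuming
GCC or Clang; sketch_readl(), sketch_writel() and sketch_demo() are
hypothetical names, not kernel APIs. On x86 it uses a volatile inline
asm with a fixed "movl" and a "memory" clobber, so the compiler cannot
merge, split or reorder the access or pick a different instruction. On
other architectures it falls back to a plain volatile access, which does
not give the same guarantee about the instruction chosen.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit MMIO-style read with a fixed instruction on x86. */
static inline uint32_t sketch_readl(const volatile void *addr)
{
	uint32_t ret;
#if defined(__x86_64__) || defined(__i386__)
	__asm__ volatile("movl %1, %0"
			 : "=r"(ret)
			 : "m"(*(const volatile uint32_t *)addr)
			 : "memory");
#else
	/* Fallback: volatile access only, instruction choice is up to the compiler. */
	ret = *(const volatile uint32_t *)addr;
#endif
	return ret;
}

/* Hypothetical 32-bit MMIO-style write with a fixed instruction on x86. */
static inline void sketch_writel(uint32_t val, volatile void *addr)
{
#if defined(__x86_64__) || defined(__i386__)
	__asm__ volatile("movl %1, %0"
			 : "=m"(*(volatile uint32_t *)addr)
			 : "r"(val)
			 : "memory");
#else
	*(volatile uint32_t *)addr = val;
#endif
}

/* Round trip through ordinary memory standing in for a device register. */
void sketch_demo(void)
{
	uint32_t fake_reg = 0;

	sketch_writel(0xdeadbeefu, &fake_reg);
	assert(sketch_readl(&fake_reg) == 0xdeadbeefu);
}
```

A handler that decodes the faulting instruction only has to understand
the few forms such accessors emit, which is presumably why code that
bypasses them is the worrying case.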