Re: [RFC v1 05/26] x86/traps: Add #VE support for TDX guest

From: Sean Christopherson
Date: Fri Feb 12 2021 - 15:45:55 EST


On Fri, Feb 12, 2021, Andy Lutomirski wrote:
>
> > On Feb 12, 2021, at 12:06 PM, Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> >
> > On Fri, Feb 12, 2021, Andy Lutomirski wrote:
> >>> On Fri, Feb 5, 2021 at 3:39 PM Kuppuswamy Sathyanarayanan
> >>> <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx> wrote:
> >>>
> >>> From: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
> >>>
> >>> The TDX module injects a #VE exception into the guest TD for
> >>> disallowed instructions, disallowed MSR accesses and a subset of CPUID
> >>> leaves. Also, it's theoretically possible for the CPU to inject a #VE
> >>> exception on an EPT violation, but the TDX module makes sure this does
> >>> not happen, as long as all memory used is properly accepted using
> >>> TDCALLs.
> >>
> >> By my very cursory reading of the TDX arch specification 9.8.2,
> >> "Secure" EPT violations don't send #VE. But the docs are quite
> >> unclear, or at least the docs I found are.
> >
> > The version I have also states that SUPPRESS_VE is always set. So either there
> > was a change in direction, or the public docs need to be updated. Lazy accept
> > requires a #VE, either from hardware or from the module. The latter would
> > require walking the Secure EPT tables on every EPT violation...
> >
> >> What happens if the guest attempts to access a secure GPA that is not
> >> ACCEPTed? For example, suppose the VMM does TDH.MEM.PAGE.REMOVE on a secure
> >> address and the guest accesses it, via instruction fetch or data access.
> >> What happens?
> >
> > Well, as currently written in the spec, it will generate an EPT violation and
> > the host will have no choice but to kill the guest.
>
> Or page the page back in and try again?

The intended use isn't for swapping a page or migrating a page. Those flows
have dedicated APIs, and do not _remove_ a page.

E.g. the KVM RFC patches already support zapping Secure EPT entries if NUMA
balancing kicks in. But, in TDX terminology, that is a BLOCK/UNBLOCK operation.

Removal is for converting a private page to a shared page, and for paravirt
memory ballooning.

> In regular virt guests, if the host pages out a guest page, it’s the host’s
> job to put it back when needed. In paravirt, a well-designed async
> protocol can sometimes let the guest do useful work when this happens. If a
> guest (or bare metal) has its memory hot removed (via balloon or whatever)
> and the kernel messes up and accesses removed memory, the guest (or bare
> metal) is toast.
>
> I don’t see why TDX needs to be any different.

The REMOVE API isn't intended for swap. In fact, it can't be used for swap. If
a page is removed, its contents are lost. Because the original contents are
lost, the guest is required to re-accept the page so that the host can't
silently get the guest to consume a zero page that the guest thinks has valid
data.

For swap, the contents are preserved, and so explicit re-acceptance is not
required. From the guest's perspective, it's really just a high-latency memory
access.
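
To make the distinction concrete, here is a toy C model of the page
lifecycle described above. It is only a sketch: every name in it
(toy_page, host_block(), guest_accept(), and so on) is hypothetical and
is not a TDX module or KVM interface, and it assumes a removed page
comes back zeroed and pending until the guest accepts it. How the guest
learns that it must accept a page is deliberately left out, since that
is the open question in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: hypothetical names, not TDX module or KVM APIs. */
enum page_state {
	PAGE_MAPPED,	/* accepted by the guest, contents valid */
	PAGE_BLOCKED,	/* temporarily unmapped by the host, contents preserved */
	PAGE_REMOVED,	/* removed by the host, contents lost */
	PAGE_PENDING,	/* re-added by the host (assumed zeroed), not yet accepted */
};

enum access_result {
	ACCESS_OK,		/* guest sees its data, possibly after a delay */
	ACCESS_NEEDS_ACCEPT,	/* guest must explicitly accept the page first */
	ACCESS_FATAL,		/* violation the host cannot resolve by itself */
};

struct toy_page {
	enum page_state state;
	uint64_t data;
};

/* Swap, migration, NUMA balancing: the mapping goes away, the data stays. */
static void host_block(struct toy_page *p)
{
	assert(p->state == PAGE_MAPPED);
	p->state = PAGE_BLOCKED;
}

static void host_unblock(struct toy_page *p)
{
	assert(p->state == PAGE_BLOCKED);
	p->state = PAGE_MAPPED;
}

/* Private->shared conversion, ballooning: the original contents are lost. */
static void host_remove(struct toy_page *p)
{
	p->state = PAGE_REMOVED;
	p->data = 0;
}

/* Host hands the page back; it stays unusable until the guest accepts it. */
static void host_add(struct toy_page *p)
{
	assert(p->state == PAGE_REMOVED);
	p->state = PAGE_PENDING;
}

/* Guest acknowledges that the page is fresh; returns false if not pending. */
static bool guest_accept(struct toy_page *p)
{
	if (p->state != PAGE_PENDING)
		return false;
	p->state = PAGE_MAPPED;
	return true;
}

static enum access_result guest_access(struct toy_page *p)
{
	switch (p->state) {
	case PAGE_MAPPED:
		return ACCESS_OK;
	case PAGE_BLOCKED:
		/* Host restores the mapping and retries: just a slow access. */
		host_unblock(p);
		return ACCESS_OK;
	case PAGE_PENDING:
		return ACCESS_NEEDS_ACCEPT;
	case PAGE_REMOVED:
	default:
		return ACCESS_FATAL;
	}
}
```

In this model, an access after host_block() still returns ACCESS_OK with
the original data intact, whereas after host_remove() and host_add() the
guest only gets ACCESS_OK again once guest_accept() has run, and at that
point it knows the old contents are gone.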