Re: [RFC v1 05/26] x86/traps: Add #VE support for TDX guest
From: Sean Christopherson
Date: Fri Feb 12 2021 - 15:55:09 EST

On Fri, Feb 12, 2021, Dave Hansen wrote:
> On 2/12/21 12:37 PM, Sean Christopherson wrote:
> > There needs to be a mechanism for lazy/deferred/on-demand acceptance of pages.
> > E.g. pre-accepting every page in a VM with hundreds of GB of memory will be
> > ridiculously slow.
> >
> > #VE is the best option to do that:
> >
> > - Relatively sane re-entrancy semantics.
> > - Hardware accelerated.
> > - Doesn't require stealing an IRQ from the guest.
>
> TDX already provides a basic environment for the guest when it starts
> up. The guest has some known, good memory. The guest also has a very,
> very clear understanding of which physical pages it uses and when. It's
> staged, of course, as decompression happens and the guest comes up.
>
> But, the guest still knows which guest physical pages it accesses and
> when. It doesn't need on-demand faulting in of non-accepted pages. It
> can simply decline to expose non-accepted pages to the wider system
> before they've been accepted.
>
> It would be nuts to merrily free non-accepted pages into the page
> allocator and handle the #VE fallout as they're touched from
> god-knows-where.
>
> I don't see *ANY* case for #VE to occur inside the guest kernel, outside
> of *VERY* narrow places like copy_from_user(). Period. #VE from ring-0
> is not OK.
>
> So, no, #VE is not the best option. No #VE's in the first place is the
> best option.

Ah, I see what you're thinking.

Treating an EPT #VE as fatal was also considered as an option. IIUC it was
thought that finding every nook and cranny that could access a page, without
forcing the kernel to pre-accept huge swaths of memory, would be very difficult.
It'd be wonderful if that's not the case.
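
For illustration only, the "accept before exposing" idea could be sketched as a small userspace simulation. This is a rough sketch, not the TDX guest implementation: fake_accept_page(), accept_before_expose(), the bitmap, and NR_PAGES are all hypothetical names made up for this example, and the real accept operation is replaced by a counter.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_PAGES 1024u	/* toy guest size, purely illustrative */

static uint8_t accepted[NR_PAGES / 8];	/* one bit per page: accepted yet? */
static size_t nr_accept_calls;		/* counts simulated accept operations */

/* Hypothetical stand-in for the real "accept this page" operation. */
static void fake_accept_page(size_t pfn)
{
	accepted[pfn / 8] |= (uint8_t)(1u << (pfn % 8));
	nr_accept_calls++;
}

static bool page_is_accepted(size_t pfn)
{
	return (accepted[pfn / 8] & (1u << (pfn % 8))) != 0;
}

/*
 * Accept every not-yet-accepted page in [start, start + count) before the
 * range is handed to anything else (e.g. freed into an allocator).
 * Returns false if the range is out of bounds; nothing is accepted then.
 */
static bool accept_before_expose(size_t start, size_t count)
{
	if (start > NR_PAGES || count > NR_PAGES - start)
		return false;

	for (size_t pfn = start; pfn < start + count; pfn++)
		if (!page_is_accepted(pfn))
			fake_accept_page(pfn);

	return true;
}
```

The point of the sketch is that acceptance cost is paid per range at the moment the range is exposed, and already-accepted pages are skipped, so nothing downstream ever touches a non-accepted page. The hard part raised above, finding every place that exposes memory, is exactly what the sketch does not solve.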