Re: [PATCH v3] x86/boot/sev: Avoid shared GHCB page for early memory acceptance

From: Ard Biesheuvel
Date: Fri Apr 11 2025 - 15:01:10 EST


On Fri, 11 Apr 2025 at 20:40, Borislav Petkov <bp@xxxxxxxxx> wrote:
>
> On Thu, Apr 10, 2025 at 03:28:51PM +0200, Ard Biesheuvel wrote:
> > From: Ard Biesheuvel <ardb@xxxxxxxxxx>
> >
> > Communicating with the hypervisor using the shared GHCB page requires
> > clearing the C bit in the mapping of that page. When executing in the
> > context of the EFI boot services, the page tables are owned by the
> > firmware, and this manipulation is not possible.
> >
> > So switch to a different API for accepting memory in SEV-SNP guests, one
>
> That being the GHCB MSR protocol, it seems.
>

Yes.

> And since Tom co-developed, I guess we wanna do that.
>
> But then how much slower do we become?
>

Non-EFI stub boot will become slower if the memory that is used to
decompress the kernel has not been accepted yet. But given how heavily
SEV-SNP depends on EFI boot, this typically only happens on kexec, as
that is the only boot path that goes through the traditional
decompressor.
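
To give a rough idea of where the slowdown comes from: as far as I
understand the GHCB spec, the MSR protocol changes the state of a single
4K page per request, so accepting a range means one MSR exchange per
page. The sketch below only models the request encoding and the request
count. The constants and bit positions are my reading of the spec, and
the helper names (psc_req, psc_requests_needed) are made up for
illustration; none of this is code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, per my reading of the GHCB spec (not from the patch). */
#define GHCB_MSR_PSC_REQ	0x014ULL	/* GHCBData[11:0]: request code */
#define PSC_OP_PRIVATE		1ULL		/* GHCBData[55:52]: make page private */
#define PSC_GFN_MASK		((1ULL << 40) - 1)	/* GHCBData[51:12]: 40-bit GFN */
#define PAGE_SHIFT_4K		12

/* Hypothetical helper: build the MSR value for one 4K page state change. */
static uint64_t psc_req(uint64_t gfn, uint64_t op)
{
	return ((op & 0xf) << 52) |
	       ((gfn & PSC_GFN_MASK) << PAGE_SHIFT_4K) |
	       GHCB_MSR_PSC_REQ;
}

/* Hypothetical helper: MSR exchanges needed to accept [start, end). */
static uint64_t psc_requests_needed(uint64_t start, uint64_t end)
{
	/* Both ends must be 4K aligned, and the range must not be inverted. */
	assert((start & 0xfff) == 0 && (end & 0xfff) == 0 && start <= end);
	return (end - start) >> PAGE_SHIFT_4K;
}
```

Under these assumptions, accepting a single 2MB region takes 512
requests, each of which exits to the hypervisor.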

> And nothing in here talks about why that GHCB method worked or didn't
> work before and that it is ok or not ok why we're axing that off.
>

---%<---
The GHCB shared page method never worked for accepting memory from the
EFI stub, but this is rarely needed in practice: when using the higher
level page allocation APIs, the firmware will make sure that memory is
accepted before it is returned. The only use case for explicit memory
acceptance by the EFI stub is when populating the 'unaccepted memory'
bitmap, which tracks unaccepted memory at a 2MB granularity. Chunks of
unaccepted memory that are not aligned to that granularity cannot be
represented in the bitmap, so the EFI stub accepts them right away,
without them being allocated or used.
---%<---
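
As a simplified illustration of the above (not the actual EFI stub
code; the struct and function names are hypothetical), splitting an
unaccepted range along the 2MB bitmap granularity could look like this.
The misaligned head and tail are the parts that need to be accepted
immediately, and only the aligned middle ends up in the bitmap:

```c
#include <assert.h>
#include <stdint.h>

#define UNIT_SIZE	(2ULL << 20)	/* 2MB bitmap granularity */

/* Hypothetical result type: three half-open [start, end) sub-ranges. */
struct split {
	uint64_t head_start, head_end;	/* misaligned head: accept now */
	uint64_t mid_start, mid_end;	/* aligned middle: record in bitmap */
	uint64_t tail_start, tail_end;	/* misaligned tail: accept now */
};

static uint64_t align_up(uint64_t x)
{
	return (x + UNIT_SIZE - 1) & ~(UNIT_SIZE - 1);
}

static uint64_t align_down(uint64_t x)
{
	return x & ~(UNIT_SIZE - 1);
}

static struct split split_range(uint64_t start, uint64_t end)
{
	uint64_t a = align_up(start), b = align_down(end);
	struct split s;

	assert(start <= end);

	if (a >= b) {
		/* No full 2MB unit inside the range: accept all of it now. */
		s.head_start = start;
		s.head_end = end;
		s.mid_start = s.mid_end = end;
		s.tail_start = s.tail_end = end;
		return s;
	}

	s.head_start = start;
	s.head_end = a;
	s.mid_start = a;
	s.mid_end = b;
	s.tail_start = b;
	s.tail_end = end;
	return s;
}
```

A range that starts and ends on 2MB boundaries produces an empty head
and tail, in which case no explicit acceptance is needed at all.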

> I'm somehow missing that aspect of why that change is warranted...
>

This never worked correctly for SEV-SNP; we're just lucky that the
firmware appears to accept memory in batches of 2MB or more, so these
misaligned chunks are rare in practice. Tom did manage to trigger it,
IIUC, by giving a VM an amount of memory that is not a multiple of 2MB.