Re: [PATCH 0/7] KVM: x86: guest MAXPHYADDR and C-bit fixes

From: Tom Lendacky
Date: Thu Jun 24 2021 - 12:31:01 EST


On 6/23/21 6:05 PM, Sean Christopherson wrote:
> A few fixes centered around enumerating guest MAXPHYADDR and handling the
> C-bit in KVM.
>
> DISCLAIMER: I have no idea if patch 04, "Truncate reported guest
> MAXPHYADDR to C-bit if SEV is" is architecturally correct. The APM says
> the following about the C-bit in the context of SEV, but I can't for the
> life of me find anything in the APM that clarifies whether "effectively
> reduced" is supposed to apply to _only_ SEV guests, or any guest on an
> SEV enabled platform.
>
> Note that because guest physical addresses are always translated through
> the nested page tables, the size of the guest physical address space is
> not impacted by any physical address space reduction indicated in
> CPUID 8000_001F[EBX]. If the C-bit is a physical address bit however,
> the guest physical address space is effectively reduced by 1 bit.
>
> In practice, I have observed that Rome CPUs treat the C-bit as reserved for
> non-SEV guests (another disclaimer on this below). Long story short, commit
> ef4c9f4f6546 ("KVM: selftests: Fix 32-bit truncation of vm_get_max_gfn()")
> exposed the issue by inadvertently causing selftests to start using GPAs
> with bit 47 set.
>
> That said, regardless of whether or not the behavior is intended, it needs
> to be addressed by KVM. I think the only difference is whether this is
> KVM's _only_ behavior, or whether it's gated by an erratum flag.
>
> The second disclaimer is that I haven't tested with memory encryption
> disabled in hardware. I wrote the patch assuming/hoping that only CPUs
> that report SEV=1 treat the C-bit as reserved, but I haven't actually
> tested the SEV=0 case on e.g. CPUs with only SME (we might have these
> platforms, but I've no idea how to access/find them), or CPUs with SME/SEV
> disabled in BIOS (again, I've no idea how to do this with our BIOS).

Here's an explanation of the physical address reduction for bare-metal and
guest.

With MSR 0xC001_0010[SMEE] = 0:
No reduction in host or guest max physical address.

With MSR 0xC001_0010[SMEE] = 1:
- Reduction in the host is enumerated by CPUID 0x8000_001F_EBX[11:6],
  regardless of whether SME is enabled in the host or not. So, for example,
  on EPYC generation 2 (Rome) you would see a reduction from 48 to 43 bits.
- There is no reduction in physical address in a legacy guest (non-SEV
  guest), so the guest can use a 48-bit physical address.
- There is a reduction of only the encryption bit in an SEV guest, so
  the guest can use up to a 47-bit physical address. This is why the
  Qemu command line sev-guest option uses a value of 1 for the
  "reduced-phys-bits" parameter.
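
The rules above can be sketched as a small C helper. This is only an
illustration under stated assumptions: the struct, enum, and function names
are hypothetical (they are not kernel or KVM identifiers), the caller is
assumed to have already read the MSR and CPUID values, and the SEV guest
case assumes the C-bit is a physical address bit, as it is on Rome.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for values the caller has already read. */
struct phys_addr_cfg {
	unsigned int maxphyaddr;     /* raw physical address width, e.g. 48 */
	unsigned int reduction_bits; /* CPUID 0x8000_001F EBX[11:6] */
	bool smee;                   /* MSR 0xC001_0010[SMEE] */
};

/* Hypothetical contexts matching the three cases described above. */
enum addr_context {
	CTX_HOST,
	CTX_LEGACY_GUEST, /* non-SEV guest */
	CTX_SEV_GUEST,
};

/* Extract the physical address reduction field, EBX[11:6]. */
static inline unsigned int ebx_to_reduction_bits(uint32_t ebx)
{
	return (ebx >> 6) & 0x3f;
}

/* Usable physical address bits for a given context. */
static inline unsigned int usable_phys_bits(const struct phys_addr_cfg *cfg,
					    enum addr_context ctx)
{
	/* SMEE = 0: no reduction in host or guest. */
	if (!cfg->smee)
		return cfg->maxphyaddr;

	switch (ctx) {
	case CTX_HOST:
		/* Host loses the bits enumerated by CPUID. */
		return cfg->maxphyaddr - cfg->reduction_bits;
	case CTX_SEV_GUEST:
		/* Assumption: C-bit is a physical address bit, so lose 1 bit. */
		return cfg->maxphyaddr - 1;
	case CTX_LEGACY_GUEST:
	default:
		/* Legacy guest sees no reduction. */
		return cfg->maxphyaddr;
	}
}
```

With the Rome numbers from above (width 48, reduction 5, SMEE = 1), this
gives 43 for the host, 48 for a legacy guest and 47 for an SEV guest.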

Thanks,
Tom

>
> Sean Christopherson (7):
>   KVM: x86: Use guest MAXPHYADDR from CPUID.0x8000_0008 iff TDP is
>     enabled
>   KVM: x86: Use kernel's x86_phys_bits to handle reduced MAXPHYADDR
>   KVM: x86: Truncate reported guest MAXPHYADDR to C-bit if SEV is
>     supported
>   KVM: x86/mmu: Do not apply HPA (memory encryption) mask to GPAs
>   KVM: VMX: Refactor 32-bit PSE PT creation to avoid using MMU macro
>   KVM: x86/mmu: Bury 32-bit PSE paging helpers in paging_tmpl.h
>   KVM: x86/mmu: Use separate namespaces for guest PTEs and shadow PTEs
>
> arch/x86/kvm/cpuid.c | 38 +++++++++++++++++---
> arch/x86/kvm/mmu.h | 11 ++----
> arch/x86/kvm/mmu/mmu.c | 63 ++++++++-------------------------
> arch/x86/kvm/mmu/mmu_audit.c | 6 ++--
> arch/x86/kvm/mmu/mmu_internal.h | 14 ++++++++
> arch/x86/kvm/mmu/paging_tmpl.h | 52 ++++++++++++++++++++++++++-
> arch/x86/kvm/mmu/spte.c | 2 +-
> arch/x86/kvm/mmu/spte.h | 34 +++++++-----------
> arch/x86/kvm/mmu/tdp_iter.c | 6 ++--
> arch/x86/kvm/mmu/tdp_mmu.c | 2 +-
> arch/x86/kvm/svm/svm.c | 37 ++++++++++++++-----
> arch/x86/kvm/vmx/vmx.c | 2 +-
> arch/x86/kvm/x86.c | 3 ++
> arch/x86/kvm/x86.h | 1 +
> 14 files changed, 170 insertions(+), 101 deletions(-)
>