Re: [PATCH 05/16] KVM: x86/mmu: Use synthetic page fault error code to indicate private faults

From: Sean Christopherson
Date: Wed Mar 06 2024 - 09:53:49 EST


On Wed, Mar 06, 2024, Xu Yilun wrote:
> On Tue, Feb 27, 2024 at 06:41:36PM -0800, Sean Christopherson wrote:
> > Add and use a synthetic, KVM-defined page fault error code to indicate
> > whether a fault is to private vs. shared memory. TDX and SNP have
> > different mechanisms for reporting private vs. shared, and KVM's
> > software-protected VMs have no mechanism at all. Usurp an error code
> > flag to avoid having to plumb another parameter to kvm_mmu_page_fault()
> > and friends.
> >
> > Alternatively, KVM could borrow AMD's PFERR_GUEST_ENC_MASK, i.e. set it
> > for TDX and software-protected VMs as appropriate, but that would require
> > *clearing* the flag for SEV and SEV-ES VMs, which support encrypted
> > memory at the hardware layer, but don't utilize private memory at the
> > KVM layer.
>
> I see this alternative in other patchset. And I still don't understand why
> synthetic way is better after reading patch #5-7. I assume for SEV(-ES) if
> there is spurious PFERR_GUEST_ENC flag set in error code when private memory
> is not used in KVM, then it is a HW issue or SW bug that needs to be caught
> and warned, and by the way cleared.

The conundrum is that SEV(-ES) support _encrypted_ memory, but cannot support
what KVM calls "private" memory. In quotes because SEV(-ES) memory encryption
does provide confidentially/privacy, but in KVM they don't support memslots that
can be switched between private and shared, e.g. will return false for
kvm_arch_has_private_mem().

And KVM _can't_ sanely use private/shared memslots for SEV(-ES), because it's
impossible to intercept implicit conversions by the guest, i.e. KVM can't prevent
the guest from encrypting a page that KVM thinks is private, and vice versa.

But from hardware's perspective, while the APM is a bit misleading and I don't
love the behavior, I can't really argue that it's a hardware bug if the CPU sets
PFERR_GUEST_ENC on a fault where the guest access had C-bit=1, because the access
_was_ indeed to encrypted memory.

Which is a long-winded way of saying the unwanted PFERR_GUEST_ENC faults aren't
really spurious, nor are they hardware or software bugs, just another unfortunate
side effect of the fact that the hypervisor can't intercept shared<->encrypted
conversions for SEV(-ES) guests. And that means that KVM can't WARN, because
literally every SNP-capable CPU would yell when running SEV(-ES) guests.

I also don't like the idea of usurping PFERR_GUEST_ENC to have it mean something
different in KVM as compared to how hardware defines/uses the flag.

Lastly, the approach in Paolo's series to propagate PFERR_GUEST_ENC to .is_private
iff the VM has private memory is brittle, in that it would be too easy for KVM
code that has access to the error code to do the wrong thing and interpret
PFERR_GUEST_ENC has meaning "private".