Re: Possible 5.19 regression for systems with 52-bit physical address support

From: Tom Lendacky
Date: Thu Jul 28 2022 - 12:05:43 EST


On 7/28/22 09:56, Sean Christopherson wrote:
On Thu, Jul 28, 2022, Michael Roth wrote:
On Thu, Jul 28, 2022 at 08:44:30AM -0500, Michael Roth wrote:
Hi Sean,

With this patch applied, AMD processors that support 52-bit physical

Sorry, threading got messed up. This is in reference to:

https://lore.kernel.org/lkml/20220420002747.3287931-1-seanjc@xxxxxxxxxx/#r

commit 8b9e74bfbf8c7020498a9ea600bd4c0f1915134d
Author: Sean Christopherson <seanjc@xxxxxxxxxx>
Date: Wed Apr 20 00:27:47 2022 +0000

KVM: x86/mmu: Use enable_mmio_caching to track if MMIO caching is enabled

Oh crud. I suspect I also broke EPT with MAXPHYADDR=52; the initial
kvm_mmu_reset_all_pte_masks() will clear the flag, and it won't get set back to
true even though EPT can generate a reserved bit fault.

address will result in MMIO caching being disabled. This ends up
breaking SEV-ES and SNP, since they rely on the MMIO reserved bit to
generate the appropriate NAE MMIO exit event.

This failure can also be reproduced on Milan by disabling mmio_caching
via KVM module parameter.

Hrm, this is a separate bug of sorts. SEV-ES (and later) needs to have an explicit
check that MMIO caching is enabled, e.g. my bug aside, if KVM can't use MMIO caching
due to the location of the C-bit, then SEV-ES must be disabled.

Speaking of which, what prevents hardware (firmware?) from configuring the C-bit
position to be bit 51 and thus preventing KVM from generating the reserved #NPF?

On the hypervisor side, there is more than a single bit of physical addressing reduction when memory encryption is enabled. So even when the C-bit position is bit 51, some number of bits below 51 are reserved and will cause the reserved #NPF.
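
To make the arithmetic concrete, here is a minimal user-space sketch, not KVM code. The helper name and the example reduction values are hypothetical; on real hardware the C-bit position and the address reduction are reported by CPUID Fn8000_001F EBX. It computes the reserved bits that remain strictly below the C-bit:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a KVM function): mask of host physical address
 * bits that are reserved but sit strictly below the C-bit.
 *
 * maxphyaddr: physical address width reported by the CPU, e.g. 52
 * reduction:  address bits lost when memory encryption is enabled
 * c_bit:      position of the encryption bit
 */
static uint64_t rsvd_bits_below_c_bit(unsigned int maxphyaddr,
                                      unsigned int reduction,
                                      unsigned int c_bit)
{
    unsigned int first_rsvd;

    assert(maxphyaddr <= 52 && reduction <= maxphyaddr && c_bit < 64);

    /* First bit above the reduced, usable physical address range. */
    first_rsvd = maxphyaddr - reduction;

    /* No bits between the usable range and the C-bit: nothing to use. */
    if (first_rsvd >= c_bit)
        return 0;

    /* Set bits first_rsvd..c_bit-1; the width is at most 63 here. */
    return ((1ULL << (c_bit - first_rsvd)) - 1) << first_rsvd;
}
```

With MAXPHYADDR=52, the C-bit at 51, and an assumed reduction of 5 bits, bits 47..50 remain available. With a reduction of only 1 bit the result would be 0, which is the case the question above is worried about.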

Thanks,
Tom


In the case of AMD, guests use a separate physical address range that
and so there are still reserved bits available to make use of the MMIO
caching. This adjustment happens in svm_adjust_mmio_mask(), but since
the enable_mmio_caching flag is 0, any attempts to update the masks get
ignored by kvm_mmu_set_mmio_spte_mask().
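
The sequence described above can be modelled in user space. This is only a sketch under assumptions: the names carry a _model or _buggy suffix because they are stand-ins, not the kernel symbols, and the function keeps just the two checks that matter here (force the value to 0 when caching is off, turn caching off when the value is 0).

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for enable_mmio_caching and the configured MMIO value. */
static bool enable_mmio_caching_model = true;
static uint64_t shadow_mmio_value_model;

/* Simplified model of the pre-fix mask setter; not the kernel function. */
static void set_mmio_spte_mask_buggy(uint64_t mmio_value)
{
    /* Caching disabled: ignore whatever mask the caller asked for. */
    if (!enable_mmio_caching_model)
        mmio_value = 0;

    /* No usable mask: disable caching. Nothing ever re-enables it. */
    if (!mmio_value)
        enable_mmio_caching_model = false;

    shadow_mmio_value_model = mmio_value;
}
```

Calling set_mmio_spte_mask_buggy(0) first (the generic reset finding no reserved bits with MAXPHYADDR=52) and then calling it again with a non-zero vendor mask leaves shadow_mmio_value_model at 0, because the flag is already false by the time the second call runs.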

Would adding a 'force' parameter to kvm_mmu_set_mmio_spte_mask() that
svm_adjust_mmio_mask() can set to ignore enable_mmio_caching be a
reasonable fix, or should we take a different approach?

Different approach. To fix the bug with enable_mmio_caching not being set back to
true when a vendor-specific mask allows caching, I believe the below will do the
trick.

The SEV-ES dependency is easy to solve, but will require a few patches in order
to get the necessary ordering; svm_adjust_mmio_mask() is currently called _after_
SEV-ES is configured.

I'll test (as much as I can, I don't think we have platforms with MAXPHYADDR=52)
and get a series sent out later today.

diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 7314d27d57a4..a57add994b8d 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -19,8 +19,9 @@
#include <asm/memtype.h>
#include <asm/vmx.h>

-bool __read_mostly enable_mmio_caching = true;
-module_param_named(mmio_caching, enable_mmio_caching, bool, 0444);
+bool __read_mostly enable_mmio_caching;
+static bool __read_mostly __enable_mmio_caching = true;
+module_param_named(mmio_caching, __enable_mmio_caching, bool, 0444);

u64 __read_mostly shadow_host_writable_mask;
u64 __read_mostly shadow_mmu_writable_mask;
@@ -340,6 +341,8 @@ void kvm_mmu_set_mmio_spte_mask(u64 mmio_value, u64 mmio_mask, u64 access_mask)
BUG_ON((u64)(unsigned)access_mask != access_mask);
WARN_ON(mmio_value & shadow_nonpresent_or_rsvd_lower_gfn_mask);

+ enable_mmio_caching = __enable_mmio_caching;
+
if (!enable_mmio_caching)
mmio_value = 0;
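
For illustration, the effect of the change can be sketched in user space. This is a model under assumptions, not the kernel code: the fixed_ names are hypothetical stand-ins for __enable_mmio_caching (the module parameter) and enable_mmio_caching (the effective state), and everything unrelated to the flag handling is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the module parameter: what the admin asked for. */
static bool fixed_param_mmio_caching = true;
/* Stand-in for the effective state: what the current masks allow. */
static bool fixed_enable_mmio_caching;
static uint64_t fixed_shadow_mmio_value;

/* Simplified model of the mask setter with the fix applied. */
static void set_mmio_spte_mask_fixed(uint64_t mmio_value)
{
    /* Start every call from the admin's choice, not from stale state. */
    fixed_enable_mmio_caching = fixed_param_mmio_caching;

    if (!fixed_enable_mmio_caching)
        mmio_value = 0;

    /* Still disable caching if this particular mask is unusable. */
    if (!mmio_value)
        fixed_enable_mmio_caching = false;

    fixed_shadow_mmio_value = mmio_value;
}
```

In this model a first call with 0 followed by a call with a non-zero vendor mask ends with caching enabled and the vendor mask stored, while clearing fixed_param_mmio_caching still forces the stored value to 0 on the next call.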