Re: [PATCH 2/4] KVM: x86/mmu: Fully re-evaluate MMIO caching when SPTE masks change
From: Sean Christopherson
Date: Fri Jul 29 2022 - 11:08:08 EST
On Fri, Jul 29, 2022, Kai Huang wrote:
> On Thu, 2022-07-28 at 22:17 +0000, Sean Christopherson wrote:
> > Fully re-evaluate whether or not MMIO caching can be enabled when SPTE
> > masks change; simply clearing enable_mmio_caching when a configuration
> > isn't compatible with caching fails to handle the scenario where the
> > masks are updated, e.g. by VMX for EPT or by SVM to account for the C-bit
> > location, and toggle compatibility from false=>true.
> >
> > Snapshot the original module param so that re-evaluating MMIO caching
> > preserves userspace's desire to allow caching. Use a snapshot approach
> > so that enable_mmio_caching still reflects KVM's actual behavior.
> >
..
> > @@ -340,6 +353,12 @@ void kvm_mmu_set_mmio_spte_mask(u64 mmio_value, u64 mmio_mask, u64 access_mask)
> > BUG_ON((u64)(unsigned)access_mask != access_mask);
> > WARN_ON(mmio_value & shadow_nonpresent_or_rsvd_lower_gfn_mask);
> >
> > + /*
> > + * Reset to the original module param value to honor userspace's desire
> > + * to (dis)allow MMIO caching. Update the param itself so that
> > + * userspace can see whether or not KVM is actually using MMIO caching.
> > + */
> > + enable_mmio_caching = allow_mmio_caching;
>
> I think the problem is that the MMIO caching mask/value are first set in
> kvm_mmu_reset_all_pte_masks() (which calls kvm_mmu_set_mmio_spte_mask() and may
> change enable_mmio_caching), and later vendor-specific code _may_ or _may_not_
> call kvm_mmu_set_mmio_spte_mask() again to adjust the mask/value. And when it
> does, the second call from vendor-specific code shouldn't depend on the
> 'enable_mmio_caching' value calculated by the first call in
> kvm_mmu_reset_all_pte_masks().
Correct.
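Something like the following (a simplified userspace sketch, *not* the actual KVM code; the names mirror the patch but the bodies are illustrative) shows why the snapshot matters for the init-then-vendor-override sequence: the second call resets to the original module param instead of inheriting "false" from the first call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static bool allow_mmio_caching = true;	/* snapshot of the original module param */
static bool enable_mmio_caching;	/* KVM's effective behavior */
static uint64_t shadow_mmio_value;
static uint64_t shadow_mmio_mask;

static void kvm_mmu_set_mmio_spte_mask(uint64_t mmio_value, uint64_t mmio_mask)
{
	/* Reset to the snapshot; don't inherit "false" from a prior call. */
	enable_mmio_caching = allow_mmio_caching;

	/* No usable mask means MMIO SPTEs can't be recognized at all. */
	if (!mmio_mask)
		enable_mmio_caching = false;

	shadow_mmio_value = enable_mmio_caching ? mmio_value : 0;
	shadow_mmio_mask  = enable_mmio_caching ? mmio_mask  : 0;
}
```

With the snapshot, a generic-init call with an incompatible configuration disables caching, and a later vendor call with a valid mask re-enables it; without the snapshot, the first call's "false" would stick.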
> Instead of using 'allow_mmio_caching', should we just remove the call to
> kvm_mmu_set_mmio_spte_mask() from kvm_mmu_reset_all_pte_masks() and require
> vendor-specific code to always call kvm_mmu_set_mmio_spte_mask() based on
> whatever hardware features the vendor uses?
Hmm, I'd rather not force vendor code to duplicate the "basic" functionality.
It's somewhat silly to preserve the common path since both SVM and VMX need to
override it, but on the other hand those overrides are conditional.
Case in point, if I'm reading the below patch correctly, svm_shadow_mmio_mask will
be left '0' if the platform doesn't support memory encryption (svm_adjust_mmio_mask()
will bail early). That's a solvable problem, but then I think KVM just ends up
punting this issue to SVM to some extent.
Another flaw in the below patch is that enable_mmio_caching doesn't need to be
tracked on a per-VM basis. VMX with EPT can have different masks, but barring a
massive change in KVM or hardware, there will never be a scenario where caching is
enabled for one VM but not another.
And isn't the below patch also broken for TDX? For TDX, unless things have changed,
the mask+value is supposed to be SUPPRESS_VE==0 && RWX==0. So either KVM is generating
the wrong mask (MAXPHYADDR < 51), or KVM is incorrectly marking MMIO caching as disabled
in the TDX case.
Lastly, in preparation for TDX, enable_mmio_caching should be changed to key off
of the _mask_, not the value. E.g. for TDX, the value will be '0', but the mask
should be SUPPRESS_VE | RWX.
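To illustrate the mask-vs-value point (illustration only; the bit positions below are assumptions for this sketch, not taken from the patch): for TDX the MMIO SPTE *value* is all zeroes (SUPPRESS_VE and RWX clear), while the *mask* is SUPPRESS_VE | RWX, so a value-based "can we cache MMIO?" check wrongly reports caching as impossible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EPT_RWX_MASK		0x7ull		/* EPT Read/Write/eXecute bits */
#define EPT_SUPPRESS_VE		(1ull << 63)	/* assumed bit position for the sketch */

static bool caching_possible_by_value(uint64_t mmio_value)
{
	return mmio_value != 0;		/* breaks for TDX's all-zero value */
}

static bool caching_possible_by_mask(uint64_t mmio_mask)
{
	return mmio_mask != 0;		/* works: TDX's mask is non-zero */
}
```

The mask answers "can an MMIO SPTE be distinguished from other SPTEs?", which is the question enable_mmio_caching actually needs answered; the value is just what gets installed and is legitimately zero on TDX.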
> I am suggesting this way because in Isaku's TDX patch
>
> [PATCH v7 037/102] KVM: x86/mmu: Track shadow MMIO value/mask on a per-VM basis
>
> we will enable per-VM MMIO mask/value, which will remove global
> shadow_mmio_mask/shadow_mmio_value, and I am already suggesting something
> similar:
>
> https://lore.kernel.org/all/20220719084737.GU1379820@xxxxxxxxxxxxxxxxxxxxx/
>