Re: [PATCH 2/4] KVM: x86/mmu: Defer "full" MMU setup until after vendor hardware_setup()

From: David Matlack
Date: Mon Jun 27 2022 - 18:50:19 EST


On Mon, Jun 27, 2022 at 03:40:49PM +0000, Sean Christopherson wrote:
> On Sat, Jun 25, 2022, David Matlack wrote:
> > On Fri, Jun 24, 2022 at 11:27:33PM +0000, Sean Christopherson wrote:
> > > Alternatively, the setup could be done in kvm_configure_mmu(), but that
> > > would require vendor code to call e.g. kvm_unconfigure_mmu() in teardown
> > > and error paths, i.e. doesn't actually save code and is arguably uglier.
> > [...]
> > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > index 17ac30b9e22c..ceb81e04aea3 100644
> > > --- a/arch/x86/kvm/mmu/mmu.c
> > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > @@ -6673,10 +6673,8 @@ void kvm_mmu_x86_module_init(void)
> > > * loaded as many of the masks/values may be modified by VMX or SVM, i.e. need
> > > * to be reset when a potentially different vendor module is loaded.
> > > */
> > > -int kvm_mmu_vendor_module_init(void)
> > > +void kvm_mmu_vendor_module_init(void)
> > > {
> > > - int ret = -ENOMEM;
> > > -
> > > /*
> > > * MMU roles use union aliasing which is, generally speaking, an
> > > * undefined behavior. However, we supposedly know how compilers behave
> > > @@ -6687,7 +6685,13 @@ int kvm_mmu_vendor_module_init(void)
> > > BUILD_BUG_ON(sizeof(union kvm_mmu_extended_role) != sizeof(u32));
> > > BUILD_BUG_ON(sizeof(union kvm_cpu_role) != sizeof(u64));
> > >
> > > + /* Reset the PTE masks before the vendor module's hardware setup. */
> > > kvm_mmu_reset_all_pte_masks();
> > > +}
> > > +
> > > +int kvm_mmu_hardware_setup(void)
> > > +{
> >
> > Instead of putting this code in a new function and calling it after
> > hardware_setup(), we could put it in kvm_configure_mmu().
>
> Ya, I noted that as an alternative in the changelog but obviously opted to not
> do the allocation in kvm_configure_mmu().

Doh! My mistake. The idea to use kvm_configure_mmu() came to me while
reviewing patch 3 and I totally forgot about that blurb in the commit
message when I came back here to leave the suggestion.

> I view kvm_configure_mmu() as a necessary
> evil. Ideally vendor code wouldn't call into the MMU during initialization, and
> common x86 would fully dictate the order of calls for MMU setup. We could force
> that, but it'd require something gross like filling a struct passed into
> ops->hardware_setup(), and probably would be less robust (more likely to omit a
> "required" field).
>
> In other words, I like the explicit kvm_mmu_hardware_setup() call from common x86,
> e.g. to show that vendor code needs to do setup before the MMU, and so that MMU
> setup isn't buried in a somewhat arbitrary location in vendor hardware setup.

Agreed, but if we're not going to get rid of kvm_configure_mmu(), we're
stuck with vendor-specific code calling into the MMU code during
hardware setup either way.

>
> I'm not dead set against handling this in kvm_configure_mmu() (though I'd probably
> vote to rename it to kvm_mmu_hardware_setup()) if anyone has a super strong opinion.

Your call. I'll put in a vote for using kvm_configure_mmu() and renaming
it to kvm_mmu_hardware_setup().

>
> > This will result in a larger patch diff, but it eliminates a subtle
> > and non-trivial-to-verify dependency ordering between
>
> Verification is "trivial" in that this WARN will fire if the order is swapped:
>
> 	if (WARN_ON_ONCE(!nr_sptes_per_pte_list))
> 		return -EIO;

Ah I missed that, that's good. Although I was thinking more from a code
readability standpoint.
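
For anyone following along, here is a rough userspace sketch of the
ordering that WARN guards. It is not the kernel code: the *_sketch
function names and the fprintf() stand-in for WARN_ON_ONCE() are made up
for illustration, and only nr_sptes_per_pte_list and the -EIO return are
taken from the patch.

```c
#include <errno.h>
#include <stdio.h>

/* Stand-in for the kernel variable of the same name; zero means "not configured". */
static unsigned int nr_sptes_per_pte_list;

/* Hypothetical stand-in for the vendor-side step (kvm_configure_mmu() in the thread). */
static void configure_mmu_sketch(unsigned int nr_sptes)
{
	nr_sptes_per_pte_list = nr_sptes;
}

/* Hypothetical stand-in for kvm_mmu_hardware_setup(), called from common x86. */
static int mmu_hardware_setup_sketch(void)
{
	/* Mirrors the WARN_ON_ONCE() check: bail if vendor setup has not run yet. */
	if (!nr_sptes_per_pte_list) {
		fprintf(stderr, "WARN: MMU setup ran before vendor configuration\n");
		return -EIO;
	}

	/* The real function would go on to create pte_list_desc_cache here. */
	return 0;
}
```

Calling mmu_hardware_setup_sketch() first returns -EIO; calling it after
configure_mmu_sketch() with a non-zero value returns 0. The check catches
a swapped order at runtime, but a reader still has to know that the two
calls are related.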

>
> > kvm_configure_mmu() and kvm_mmu_hardware_setup() and it will co-locate
> > the initialization of nr_sptes_per_pte_list and the code that uses it to
> > create pte_list_desc_cache in a single function.