Re: [PATCH 2/4] KVM: x86/mmu: Defer "full" MMU setup until after vendor hardware_setup()
From: David Matlack
Date: Fri Jun 24 2022 - 20:16:16 EST
On Fri, Jun 24, 2022 at 11:27:33PM +0000, Sean Christopherson wrote:
> Defer MMU setup, and in particular allocation of pte_list_desc_cache,
> until after the vendor's hardware_setup() has run, i.e. until after the
> MMU has been configured by vendor code. This will allow a future commit
> to dynamically size pte_list_desc's array of sptes based on whether or
> not KVM is using TDP.
>
> Alternatively, the setup could be done in kvm_configure_mmu(), but that
> would require vendor code to call e.g. kvm_unconfigure_mmu() in teardown
> and error paths, i.e. doesn't actually save code and is arguably uglier.
>
> Note, keep the reset of PTE masks where it is to ensure that the masks
> are reset before the vendor's hardware_setup() runs, i.e. before the
> vendor code has a chance to manipulate the masks, e.g. VMX modifies masks
> even before calling kvm_configure_mmu().
>
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
[...]
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 17ac30b9e22c..ceb81e04aea3 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -6673,10 +6673,8 @@ void kvm_mmu_x86_module_init(void)
> * loaded as many of the masks/values may be modified by VMX or SVM, i.e. need
> * to be reset when a potentially different vendor module is loaded.
> */
> -int kvm_mmu_vendor_module_init(void)
> +void kvm_mmu_vendor_module_init(void)
> {
> - int ret = -ENOMEM;
> -
> /*
> * MMU roles use union aliasing which is, generally speaking, an
> * undefined behavior. However, we supposedly know how compilers behave
> @@ -6687,7 +6685,13 @@ int kvm_mmu_vendor_module_init(void)
> BUILD_BUG_ON(sizeof(union kvm_mmu_extended_role) != sizeof(u32));
> BUILD_BUG_ON(sizeof(union kvm_cpu_role) != sizeof(u64));
>
> + /* Reset the PTE masks before the vendor module's hardware setup. */
> kvm_mmu_reset_all_pte_masks();
> +}
> +
> +int kvm_mmu_hardware_setup(void)
> +{
Instead of putting this code in a new function and calling it after
hardware_setup(), we could put it in kvm_configure_mmu().
This will result in a larger patch diff, but has it eliminates a subtle
and non-trivial-to-verify dependency ordering between
kvm_configure_mmu() and kvm_mmu_hardware_setup() and it will co-locate
the initialization of nr_sptes_per_pte_list and the code that uses it to
create pte_list_desc_cache in a single function.