Re: [PATCH v2 2/3] KVM: x86/mmu: Use MMU shrinker to shrink KVM MMU memory caches
From: Sean Christopherson
Date: Thu Oct 24 2024 - 19:26:01 EST
On Fri, Oct 04, 2024, Vipin Sharma wrote:
> Use MMU shrinker to iterate through all the vCPUs of all the VMs and
> free pages allocated in MMU memory caches. Protect cache allocation in
> page fault and MMU load path from MMU shrinker by using a per vCPU
> mutex. In MMU shrinker, move the iterated VM to the end of the VMs list
> so that the pain of emptying cache spread among other VMs too.
>
> The specific caches to empty are mmu_shadow_page_cache and
> mmu_shadowed_info_cache as these caches store whole pages. Emptying them
> will give more impact to shrinker compared to other caches like
> mmu_pte_list_desc_cache{} and mmu_page_header_cache{}
>
> Holding per vCPU mutex lock ensures that a vCPU doesn't get surprised
> by finding its cache emptied after filling them up for page table
> allocations during page fault handling and MMU load operation. Per vCPU
> mutex also makes sure there is only race between MMU shrinker and all
> other vCPUs. This should result in very less contention.
>
> Signed-off-by: Vipin Sharma <vipinsh@xxxxxxxxxx>
> ---
...
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 213e46b55dda2..8e2935347615d 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -4524,29 +4524,33 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
> if (r != RET_PF_INVALID)
> return r;
>
> + mutex_lock(&vcpu->arch.mmu_memory_cache_lock);
> r = mmu_topup_memory_caches(vcpu, false);
> if (r)
> - return r;
> + goto out_mmu_memory_cache_unlock;
>
> r = kvm_faultin_pfn(vcpu, fault, ACC_ALL);
> if (r != RET_PF_CONTINUE)
> - return r;
> + goto out_mmu_memory_cache_unlock;
>
> r = RET_PF_RETRY;
> write_lock(&vcpu->kvm->mmu_lock);
>
> if (is_page_fault_stale(vcpu, fault))
> - goto out_unlock;
> + goto out_mmu_unlock;
>
> r = make_mmu_pages_available(vcpu);
> if (r)
> - goto out_unlock;
> + goto out_mmu_unlock;
>
> r = direct_map(vcpu, fault);
>
> -out_unlock:
> +out_mmu_unlock:
> write_unlock(&vcpu->kvm->mmu_lock);
> kvm_release_pfn_clean(fault->pfn);
> +out_mmu_memory_cache_unlock:
> + mutex_unlock(&vcpu->arch.mmu_memory_cache_lock);
I've been thinking about this patch on and off for the past few weeks, and every
time I come back to it I can't shake the feeling that we came up with a clever
solution for a problem that doesn't exist. I can't recall a single complaint
about KVM consuming an unreasonable amount of memory for page tables. In fact,
the only time I can think of where the code in question caused problems was when
I unintentionally inverted the iterator and zapped the newest SPs instead of the
oldest SPs.
So, I'm leaning more and more toward simply removing the shrinker integration.