Re: [Patch v4 03/18] KVM: x86/mmu: Track count of pages in KVM MMU page caches globally
From: Vipin Sharma
Date: Wed Mar 08 2023 - 17:17:35 EST
On Wed, Mar 8, 2023 at 12:33 PM Zhi Wang <zhi.wang.linux@xxxxxxxxx> wrote:
>
> On Mon, 6 Mar 2023 14:41:12 -0800
> Vipin Sharma <vipinsh@xxxxxxxxxx> wrote:
> > +/**
> > + * Caller should hold mutex lock corresponding to cache, if available.
> > + */
> > +static int mmu_topup_sp_memory_cache(struct kvm_mmu_memory_cache *cache,
> > + int min)
> > +{
> > + int orig_nobjs, r;
> > +
> > + orig_nobjs = cache->nobjs;
> > + r = kvm_mmu_topup_memory_cache(cache, min);
> > + if (orig_nobjs != cache->nobjs)
> > + percpu_counter_add(&kvm_total_unused_cached_pages,
> > + (cache->nobjs - orig_nobjs));
> > +
> > + return r;
> > +}
> > +
>
> Maybe kvm_mmu_topup_shadow_page_cache() would be better?
>
> As a user of kvm_mmu_topup_memory_cache(), mmu_topup_sp_memory_cache() is not
> supposed to directly touch the kvm_mmu_memory_cache metadata.
>
> The name "mmu_topup_sp_memory_cache()" looks too similar to "mmu_topup_memory_cache()".
> Renaming it would make its layering self-documenting.
>
Sounds good. I will rename it.
> > @@ -4396,25 +4438,28 @@ static int kvm_tdp_mmu_page_fault(struct kvm_vcpu *vcpu,
> > if (r != RET_PF_INVALID)
> > return r;
> >
> > + mutex_lock(&vcpu->arch.mmu_shadow_page_cache_lock);
>
> Can you elaborate more why this lock is required? When will this lock contend?
This lock is not needed in this patch. It is used in patch 4, where the
MMU shrinker frees up the cache. Sean also mentioned this in an internal
discussion. I will move the lock to the patch where it is actually used.
>
> 1) Previously mmu_topup_memory_caches() works fine without a lock.
> 2) IMHO I was suspecting if this lock seems affects the parallelization
> of the TDP MMU fault handling.
>
> TDP MMU fault handling is intend to be optimized for parallelization fault
> handling by taking a read lock and operating the page table via atomic
> operations. Multiple fault handling can enter the TDP MMU fault path
> because of read_lock(&vcpu->kvm->mmu_lock) below.
>
> W/ this lock, it seems the part of benefit of parallelization is gone
> because the lock can contend earlier above. Will this cause performance
> regression?
This is a per-vCPU lock; each vCPU will still be able to handle faults
in parallel without contending for the lock.
>
> If the lock will not contend above, then I am not sure if we need it.
>
Not in this patch, but it will be needed in patch 4, which clears the
caches via the MMU shrinker. I will move it to the patch where it is
actually needed.