Re: [Patch v4 03/18] KVM: x86/mmu: Track count of pages in KVM MMU page caches globally

From: Zhi Wang
Date: Thu Mar 09 2023 - 07:52:29 EST


On Thu, 9 Mar 2023 05:18:11 +0000
Mingwei Zhang <mizhang@xxxxxxxxxx> wrote:

> > >
> > > 1) Previously mmu_topup_memory_caches() worked fine without a lock.
> > > 2) IMHO I suspect this lock affects the parallelization of the
> > > TDP MMU fault handling.
> > >
> > > TDP MMU fault handling is intended to be optimized for parallel fault
> > > handling by taking a read lock and operating on the page table via
> > > atomic operations. Multiple fault handlers can enter the TDP MMU fault
> > > path because of read_lock(&vcpu->kvm->mmu_lock) below.
> > >
> > > W/ this lock, it seems part of the benefit of parallelization is gone
> > > because the lock can be contended earlier, above. Will this cause a
> > > performance regression?
> >
> > This is a per vCPU lock, with this lock each vCPU will still be able
> > to perform parallel fault handling without contending for lock.
> >
>
> I am curious how effective trying to acquire this per vCPU lock is.
> If a vcpu thread stays within the (host) kernel (vmx root/non-root)
> for the vast majority of the time, won't the shrinker always fail to
> make any progress?

IMHO the lock is there to prevent the faulting path from being disturbed by
the shrinker. I guess even if a vCPU thread stays in the host kernel, the
shrinker should still be able to harvest pages from the cache as long as
there is no flood of faults.
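
To make that concrete, here is a minimal user-space sketch of the behavior I
have in mind. It is not the KVM code: struct vcpu_cache, CACHE_CAPACITY,
fault_begin(), fault_end() and shrink_caches() are hypothetical names, and I
am assuming the shrinker only *tries* to take each per-vCPU lock and skips a
cache whose vCPU is currently in the fault path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_CAPACITY 40       /* illustrative value, not taken from KVM */

/* Hypothetical stand-in for one vCPU's MMU page cache. */
struct vcpu_cache {
	atomic_flag busy;       /* models the per-vCPU lock */
	int nobjs;              /* pages currently sitting in the cache */
};

static void cache_init(struct vcpu_cache *c)
{
	atomic_flag_clear(&c->busy);
	c->nobjs = 0;
}

/* Returns true if the lock was free and is now held by the caller. */
static bool cache_trylock(struct vcpu_cache *c)
{
	return !atomic_flag_test_and_set_explicit(&c->busy,
						  memory_order_acquire);
}

static void cache_unlock(struct vcpu_cache *c)
{
	atomic_flag_clear_explicit(&c->busy, memory_order_release);
}

/*
 * Fault path, first half: take this vCPU's own lock and top up the cache.
 * Only the shrinker can hold the lock, and only briefly, so vCPUs never
 * contend with each other here.
 */
static void fault_begin(struct vcpu_cache *c)
{
	while (!cache_trylock(c))
		;               /* wait for the shrinker to finish */
	c->nobjs = CACHE_CAPACITY;
}

/* Fault path, second half: consume 'used' pages and drop the lock. */
static void fault_end(struct vcpu_cache *c, int used)
{
	assert(used >= 0 && used <= c->nobjs);
	c->nobjs -= used;
	cache_unlock(c);
}

/*
 * Shrinker: walk all per-vCPU caches, skip the ones whose vCPU is in the
 * fault path, and drain the rest. Returns the number of pages reclaimed.
 */
static int shrink_caches(struct vcpu_cache *caches, size_t n)
{
	int freed = 0;

	for (size_t i = 0; i < n; i++) {
		if (!cache_trylock(&caches[i]))
			continue;       /* vCPU is faulting, leave it alone */
		freed += caches[i].nobjs;
		caches[i].nobjs = 0;
		cache_unlock(&caches[i]);
	}
	return freed;
}
```

In this model the shrinker makes progress on every cache whose vCPU is not
inside a fault at that instant, regardless of whether the vCPU thread is in
root or non-root mode. It only comes back empty-handed if every vCPU is
faulting at once.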

I am curious about the effectiveness as well. It would be nice if there were
some unit tests that people could run themselves to see the results. For
example: when the shrinker isn't triggered, is faulting still as efficient
as before? And when the shrinker is triggered, what actually happens under
different levels of system memory pressure (e.g. how much is faulting
slowed down)?