Re: [Patch v3 1/9] KVM: x86/mmu: Repurpose KVM MMU shrinker to purge shadow page caches

From: Vipin Sharma
Date: Tue Jan 03 2023 - 12:40:14 EST


On Thu, Dec 29, 2022 at 1:15 PM David Matlack <dmatlack@xxxxxxxxxx> wrote:
>
> On Wed, Dec 28, 2022 at 02:07:49PM -0800, Vipin Sharma wrote:
> > On Tue, Dec 27, 2022 at 10:37 AM Ben Gardon <bgardon@xxxxxxxxxx> wrote:
> > > On Wed, Dec 21, 2022 at 6:35 PM Vipin Sharma <vipinsh@xxxxxxxxxx> wrote:
> > > >
> > > > Tested this change by running dirty_log_perf_test while dropping cache
> > > > via "echo 2 > /proc/sys/vm/drop_caches" at 1 second interval
> > > > continuously. There were WARN_ON(!mc->nobjs) messages printed in kernel
> > > > logs from kvm_mmu_memory_cache_alloc(), which is expected.
> > >
> > > Oh, that's not a good thing. I don't think we want to be hitting those
> > > warnings. For one, kernel warnings should not be expected behavior,
> > > probably for many reasons, but at least because Syzbot will find it.
> > > In this particular case, we don't want to hit that because in that
> > > case we'll try to do a GFP_ATOMIC, which can fail, and if it fails,
> > > we'll BUG:
> > >
> > > void *kvm_mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc)
> > > {
> > > void *p;
> > >
> > > if (WARN_ON(!mc->nobjs))
> > > p = mmu_memory_cache_alloc_obj(mc, GFP_ATOMIC | __GFP_ACCOUNT);
> > > else
> > > p = mc->objects[--mc->nobjs];
> > > BUG_ON(!p);
> > > return p;
> > > }
> > >
> > > Perhaps the risk of actually panicking is small, but it probably
> > > indicates that we need better error handling around failed allocations
> > > from the cache.
> > > Or, the slightly less elegant approach might be to just hold the cache
> > > lock around the cache topup and use of pages from the cache, but
> > > adding better error handling would probably be cleaner.
> >
> > I was counting on the fact that shrinker will ideally run only in
> > extreme cases, i.e. host is running on low memory. So, this WARN_ON
> > will only be rarely used. I was not aware of Syzbot, it seems like it
> > will be a concern if it does this kind of testing.
>
> In an extreme low-memory situation, forcing vCPUS to do GFP_ATOMIC
> allocations to handle page faults is risky. Plus it's a waste of time to
> free that memory since it's just going to get immediately reallocated.
>
> >
> > I thought about keeping a mutex, taking it during topup and releasing
> > it after the whole operation is done but I stopped it as the duration
> > of holding mutex will be long and might block the memory shrinker
> > longer. I am not sure though, if this is a valid concern.
>
> Use mutex_trylock() to skip any vCPUs that are currently handling page
> faults.

oh yeah! Thanks.