Re: [PATCH V2] cgroup/rstat: Avoid thundering herd problem by kswapd across NUMA nodes
From: Shakeel Butt
Date: Mon Jun 24 2024 - 20:25:04 EST
On Mon, Jun 24, 2024 at 03:21:22PM GMT, Yosry Ahmed wrote:
> On Mon, Jun 24, 2024 at 3:17 PM Shakeel Butt <shakeel.butt@xxxxxxxxx> wrote:
> >
> > On Mon, Jun 24, 2024 at 02:43:02PM GMT, Yosry Ahmed wrote:
> > [...]
> > > >
> > > > > There is also
> > > > > a heuristic in zswap that may write back more (or fewer) pages than it
> > > > > should to the swap device if the stats are significantly stale.
> > > > >
> > > >
> > > > Is this the ratio of MEMCG_ZSWAP_B and MEMCG_ZSWAPPED in
> > > > zswap_shrinker_count()? There is already a target memcg flush in that
> > > > function and I don't expect root memcg flush from there.
> > >
> > > I was thinking of the generic approach I suggested, where we can avoid
> > > contending on the lock if the cgroup is a descendant of the cgroup
> > > being flushed, regardless of whether or not it's the root memcg. I
> > > think this would be more beneficial than just focusing on root
> > > flushes.
> >
> > Yes I agree with this but what about skipping the flush in this case?
> > Are you ok with that?
>
> Sorry if I am confused, but IIUC this patch affects all root flushes,
> even for userspace reads, right? In this case I think it's not okay to
> skip the flush without waiting for the ongoing flush.

So we differentiate between userspace and in-kernel users. For
userspace, we should not skip the flush; for in-kernel users, we can
skip it if the memcg being flushed is an ancestor of the given memcg.
Is that what you are saying?
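
To make sure we are talking about the same rule, here is a rough
userspace sketch of the decision only. It is not the patch and not the
rstat code: struct cg, cg_is_descendant() and may_skip_flush() are
hypothetical names made up for illustration, and the sketch assumes we
can tell which cgroup is currently being flushed and whether the caller
is a userspace read.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cgroup; not the kernel's struct cgroup. */
struct cg {
	struct cg *parent;
};

/* True if @anc is @cg itself or one of its ancestors. */
static bool cg_is_descendant(const struct cg *cg, const struct cg *anc)
{
	for (; cg; cg = cg->parent)
		if (cg == anc)
			return true;
	return false;
}

/*
 * @cg:        cgroup the caller wants flushed
 * @ongoing:   cgroup whose subtree is being flushed right now, or NULL
 * @from_user: true for userspace reads, which must not skip the flush
 *
 * Returns true only for an in-kernel caller whose cgroup is already
 * covered by the ongoing flush.
 */
static bool may_skip_flush(const struct cg *cg, const struct cg *ongoing,
			   bool from_user)
{
	if (from_user || !ongoing)
		return false;
	return cg_is_descendant(cg, ongoing);
}
```

With this, an in-kernel caller under an ongoing root flush skips, while
a userspace read, or a cgroup outside the subtree being flushed, does
not.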