Re: [PATCH v2 3/5] mm: memcg: make stats flushing threshold per-memcg

From: Yosry Ahmed
Date: Thu Oct 12 2023 - 22:39:09 EST


On Thu, Oct 12, 2023 at 7:33 PM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
>
> On Thu, Oct 12, 2023 at 04:28:49PM -0700, Yosry Ahmed wrote:
> > [..]
> > > > >
> > > > > Using next-20231009 and a similar 44 core machine with hyperthreading
> > > > > disabled, I ran 22 instances of netperf in parallel and got the
> > > > > following numbers from averaging 20 runs:
> > > > >
> > > > > Base: 33076.5 mbps
> > > > > Patched: 31410.1 mbps
> > > > >
> > > > > That's about 5% diff. I guess the number of iterations helps reduce
> > > > > the noise? I am not sure.
> > > > >
> > > > > Please also keep in mind that in this case all netperf instances are
> > > > > in the same cgroup and at a 4-level depth. I imagine in a practical
> > > > > setup processes would be a little more spread out, which means less
> > > > > common ancestors, so less contended atomic operations.
> > > >
> > > >
> > > > (Resending the reply as I messed up the last one, was not in plain text)
> > > >
> > > > I was curious, so I ran the same testing in a cgroup 2 levels deep
> > > > (i.e /sys/fs/cgroup/a/b), which is a much more common setup in my
> > > > experience. Here are the numbers:
> > > >
> > > > Base: 40198.0 mbps
> > > > Patched: 38629.7 mbps
> > > >
> > > > The regression is reduced to ~3.9%.
> > > >
> > > > What's more interesting is that going from a level 2 cgroup to a level
> > > > 4 cgroup is already a big hit with or without this patch:
> > > >
> > > > Base: 40198.0 -> 33076.5 mbps (~17.7% regression)
> > > > Patched: 38629.7 -> 31410.1 (~18.7% regression)
> > > >
> > > > So going from level 2 to 4 is already a significant regression for
> > > > other reasons (e.g. hierarchical charging). This patch only makes it
> > > > marginally worse. This puts the numbers more into perspective imo than
> > > > comparing values at level 4. What do you think?
> > >
> > > I think it's reasonable.
> > >
> > > Especially comparing to how many cachelines we used to touch on the
> > > write side when all flushing happened there. This looks like a good
> > > trade-off to me.
> >
> > Thanks.
> >
> > Still wanting to figure out if this patch is what you suggested in our
> > previous discussion [1], to add a
> > Suggested-by if appropriate :)
> >
> > [1]https://lore.kernel.org/lkml/20230913153758.GB45543@xxxxxxxxxxx/
>
> Haha, sort of. I suggested the cgroup-level flush-batching, but my
> proposal was missing the clever upward propagation of the pending stat
> updates that you added.
>
> You can add the tag if you're feeling generous, but I wouldn't be mad
> if you don't!

I like to think that I am a generous person :)

Will add it in the next respin.