Re: [PATCH] cgroup/rstat: avoid disabling irqs for O(num_cpu)

From: Mateusz Guzik
Date: Tue Apr 01 2025 - 11:49:48 EST


On Tue, Apr 1, 2025 at 5:00 PM Michal Koutný <mkoutny@xxxxxxxx> wrote:
> On Thu, Mar 27, 2025 at 06:47:56PM +0100, Mateusz Guzik <mjguzik@xxxxxxxxx> wrote:
> > I feel compelled to note atomics on x86-64 were expensive for as long
> > as the architecture was around so I'm confused what's up with the
> > resistance to the notion that they remain costly even with modern
> > uarchs. If anything, imo claims that they are cheap require strong
> > evidence.
>
> I don't there's strong resistance, your measurements show that it's not
> negligible under given conditions.
>
> The question is -- how much benefit would flushers have in practice with
> coalesced unlock-locks. There is the approach now with releasing for
> each CPU that is simple and benefits latency of irq dependants.
>

Toggling every n cpus instead of every single time is trivial to do.

I'm trying to avoid sending a patch in hopes of not getting CC'ed for
unrelated stuff later.

> If you see practical issues with the limited throughputs of stat readers
> (or flushers in general) because of this, please send a patch for
> discussion that resolves it while preserving (some of) the irq freedom.
>

This is some background maintenance work and it should do what's
feasible to not eat CPU.

The stock loop was behaving poorly in face of a high CPU count and it
makes excellent sense make it toggle the lock in *some* capacity.

I just don't believe going from 400+ CPUs straight to literally 1
every time is warranted. It seems the author felt justified with the
notion that it does not add overhead on contemporary hardware, but per
your own e-mail I demonstrated this does not hold.

Is this really going to suffer for toggling every 8 CPUs? that's a 50x
factor reduction

I would not be mailing here if the change was hard to do, but it
really is not. it's literally a counter in a loop getting checked.

> Also there is ongoing work of splitting up flushing per controller --
> I'd like to see whether the given locks become "small" enough to require
> no _irq exclusion at all during flushing.

the temp changes like the to stay for a long time.

tl;dr I don't believe going straight from 400 to 1 was properly
justified and I demonstrated how it hurts on a rather modest box. at
the same time a (likely) more than enough improvement over the stock
state can be trivially achieved while adding only a small fraction of
the overhead.

that said, there is bigger fish to fry elsewhere and I have no stake
in this code, so I'm not going to mail any further about this.
--
Mateusz Guzik <mjguzik gmail.com>