Re: [RFC PATCH v2] Introduce Hierarchical Per-CPU Counters
From: Roman Gushchin
Date: Tue Apr 08 2025 - 18:12:37 EST
On Tue, Apr 08, 2025 at 12:05:08PM -0400, Mathieu Desnoyers wrote:
> * Motivation
>
> The purpose of this hierarchical split-counter scheme is to:
>
> - Minimize contention when incrementing and decrementing counters,
> - Provide fast access to a sum approximation,
> - Provide a sum approximation with an acceptable accuracy level when
> scaling to many-core systems.
> - Provide approximate and precise comparison of two counters, and
> between a counter and a value.
>
> It aims at fixing the per-mm RSS tracking which has become too
> inaccurate for OOM killer purposes on large many-core systems [1].
It might be overkill for this task from the memory overhead perspective.
Sure, for a very large process on a large machine it makes total sense,
but for smaller processes it will waste a ton of memory.
Also, for a relatively small number of CPUs (e.g. 8) it's overkill
from the complexity standpoint as well.
But as an idea it makes total sense to me; maybe it's just better suited to
other tasks, e.g. global memory stats.
For RSS tracking, I wonder if what we really need is to go back to
per-thread caching, but with some time-based propagation. E.g. a thread
should dump its cached value when going to sleep or being rescheduled.
This will bound the error to (64 * number of currently running threads),
which should be acceptable. We can also think of making the central counter
an upper bound by increasing it first and moving the "pre-charged" value
to per-thread counters.
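To make both ideas a bit more concrete, here is a rough userspace sketch.
All names (mm_counter, thread_cache, thread_credit, CACHE_BATCH, ...) are
made up for illustration and are not taken from the patch or from existing
kernel code. It assumes the 64 above is the per-thread batch threshold, and
that cache_flush()/credit_return() would be called from the sleep/reschedule
path. Real code would also have to deal with preemption and thread exit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define CACHE_BATCH 64	/* assumed per-thread batch threshold */

/* Hypothetical stand-in for the shared per-mm RSS counter. */
struct mm_counter { atomic_long central; };

/* Variant 1: signed delta not yet visible in the central counter. */
struct thread_cache { long cached; };

/* Variant 2: amount already charged centrally but not yet used. */
struct thread_credit { long credit; };

/* Variant 1: would be called when the thread sleeps or is rescheduled. */
static void cache_flush(struct mm_counter *mm, struct thread_cache *tc)
{
	if (tc->cached) {
		atomic_fetch_add(&mm->central, tc->cached);
		tc->cached = 0;
	}
}

static void cache_add(struct mm_counter *mm, struct thread_cache *tc,
		      long delta)
{
	tc->cached += delta;
	if (labs(tc->cached) >= CACHE_BATCH)
		cache_flush(mm, tc);
	/* Each thread hides fewer than CACHE_BATCH units from readers. */
	assert(labs(tc->cached) < CACHE_BATCH);
}

/* Variant 2: give unused credit back, e.g. when the thread sleeps. */
static void credit_return(struct mm_counter *mm, struct thread_credit *tc)
{
	if (tc->credit) {
		atomic_fetch_sub(&mm->central, tc->credit);
		tc->credit = 0;
	}
}

static void credit_add(struct mm_counter *mm, struct thread_credit *tc,
		       long delta)
{
	if (delta > tc->credit) {
		/* Charge the shortfall plus one batch before using it. */
		long charge = delta - tc->credit + CACHE_BATCH;

		atomic_fetch_add(&mm->central, charge);
		tc->credit += charge;
	}
	tc->credit -= delta;	/* a negative delta (a free) adds credit */
	if (tc->credit > CACHE_BATCH)
		credit_return(mm, tc);
	/* central == true value + sum of credits, so it never undercounts. */
	assert(tc->credit >= 0 && tc->credit <= CACHE_BATCH);
}
```

In the first variant the central counter can be off in either direction by
less than CACHE_BATCH per running thread. In the second one it can only
overestimate, by at most CACHE_BATCH per thread holding credit.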
Thanks!