Re: [PATCH] mm: use per-numa-node atomics instead of percpu_counters

From: Mathieu Desnoyers
Date: Thu Mar 27 2025 - 16:35:46 EST


On 2025-03-26 19:36, Mateusz Guzik wrote:
[...]

Hell, it may be your patch as is can be easily repurposed to
decentralize the main percpu counter? I mean perhaps there is no need
for any fancy hierarchical structure.

Here is an initial attempt at a design document for the hierarchical
percpu counters. Feedback welcome!

Split Counters With Binary Tree Approximation Propagation
=========================================================

Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
March 27, 2025

* Propagation diagram when reaching batch size thresholds (± batch size):

Example diagram for 8 CPUs:

log2(8) = 3 levels

At each level, each pair propagates its values to the next level when
reaching the batch size thresholds.

Counters at levels 0, 1, 2 can be kept on a single byte (±128 range).
Counter at level 3 can be kept on a 32/64-bit counter.

Level 0: 0 1 2 3 4 5 6 7
| / | / | / | /
| / | / | / | /
| / | / | / | /
Level 1: 0 1 2 3
| / | /
| / | /
| / | /
Level 2: 0 1
| /
| /
| /
Level 3: 0


* Inaccuracy:

BATCH(level N): Level N batch size.

Example for BATCH(level 0) = 4

BATCH(level 0) = 4
BATCH(level 1) = 8
BATCH(level 2) = 16
BATCH(level N) = BATCH(level 0) * 2^N

per-counter global
inaccuracy inaccuracy
Level 0: ± 3 ± 24 (8 * 3)
Level 1: ± 7 ± 28 (4 * 7)
Level 2: ± 15 ± 30 (2 * 15)
Total: ------ ± 82 (log2(nr_cpus) * BATCH(level 0) * nr_cpus - Sum[0 .. log2(nr_cpus) - 1](nr_cpus / 2^n)

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com