Re: [PATCH v7 0/7] mm: Hot page tracking and promotion infrastructure

From: Bharata B Rao

Date: Tue May 05 2026 - 06:52:44 EST


On 04-May-26 11:39 AM, Bharata B Rao wrote:
> Results
> =======
> Posted as replies to this mail thread.

Graph500 benchmark results:

Test system details
-------------------
3 node AMD Zen5 system with 2 regular NUMA nodes (0, 1) and a CXL node (2)

$ numactl -H
available: 3 nodes (0-2)
node 0 cpus: 0-95,192-287
node 0 size: 128460 MB
node 1 cpus: 96-191,288-383
node 1 size: 128893 MB
node 2 cpus:
node 2 size: 257993 MB
node distances:
node   0   1   2
  0:  10  32  50
  1:  32  10  60
  2: 255 255  10

Hotness sources
---------------
NUMAB0 - NUMA Balancing disabled in the base case; no hotness source enabled
         in the pghot case. No migrations occur.
NUMAB2 - Existing hot page promotion in the base case; hint faults used as
         the hotness source in the pghot case.
NUMAB3 - Both regular and tiering modes of NUMA Balancing enabled
         (kernel.numa_balancing=3).

pghot promotes after two accesses by default. For the NUMAB2 source,
promotion is done after one access to match the base behaviour
(/sys/kernel/debug/pghot/freq_threshold=1).
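
For reference, a minimal sketch of how the NUMAB2 settings above could be
applied. It assumes root, debugfs mounted at /sys/kernel/debug, and a kernel
with this series applied. Using kernel.numa_balancing=2 (tiering mode) for
the NUMAB2 runs is an assumption here; only the value 3 and the
freq_threshold path are stated in this mail.

```shell
# Sketch only: skipped unless the pghot debugfs knob is present and writable.
thr=/sys/kernel/debug/pghot/freq_threshold

if [ -w "$thr" ]; then
    # Tiering mode of NUMA Balancing (assumed setting for NUMAB2).
    sysctl -w kernel.numa_balancing=2
    # Promote after one access instead of the default two.
    echo 1 > "$thr"
    cat "$thr"
else
    echo "pghot freq_threshold not available, nothing changed"
fi
```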

Graph500 details
----------------
Command: mpirun -n 128 --bind-to core --map-by core \
         graph500/src/graph500_reference_bfs 28 16

After graph creation, the processes are stopped and their data is migrated
to CXL node 2 before continuing, so that the BFS phase starts by accessing
lower-tier memory.
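
One possible way to do the stop/migrate/continue step is sketched below. This
is a hypothetical illustration, not necessarily how these runs were done: it
assumes the migratepages tool from the numactl package and matches the ranks
by command line. Detecting the end of graph creation is left out.

```shell
# Hypothetical sketch: pause the ranks, move their pages to node 2, resume.
# pgrep -f also matches the mpirun launcher, which is harmless here.
pids=$(pgrep -f graph500_reference_bfs)

# Stop all ranks once graph creation is done.
for pid in $pids; do kill -STOP "$pid"; done

# Move pages currently on nodes 0 and 1 to CXL node 2.
for pid in $pids; do migratepages "$pid" 0,1 2; done

# Let the ranks continue into the BFS phase.
for pid in $pids; do kill -CONT "$pid"; done
```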

Total memory usage is slightly over 100GB and will fit within Node 0 and 1.
Hence there is no memory pressure to induce demotions.

harmonic_mean_TEPS - Higher is better
=====================================================================================
                        Base            Base            pghot-default   pghot-precise
                        NUMAB0          NUMAB2          NUMAB2          NUMAB2
=====================================================================================
harmonic_mean_TEPS      5.08026e+08     7.48633e+08     5.46257e+08     7.45101e+08
mean_time               8.45413         5.73702         7.86245         5.76421
median_TEPS             5.09236e+08     7.25058e+08     5.40525e+08     7.63752e+08
max_TEPS                5.15244e+08     1.03391e+09     8.51317e+08     9.7552e+08

pgpromote_success       0               13809474        13763582        13763155
numa_pte_updates        0               26746117        39502157        36368086
numa_hint_faults        0               13811769        24248272        21172314
=====================================================================================
                        pghot-default
                        NUMAB3
=====================================================================================
harmonic_mean_TEPS      7.00515e+08
mean_time               6.13109
median_TEPS             7.06813e+08
max_TEPS                7.63164e+08

pgpromote_success       13762087
numa_pte_updates        93632490
numa_hint_faults        70566306
=====================================================================================
- The base case shows a good improvement in harmonic_mean_TEPS with NUMAB2
  (about 47% over NUMAB0).
- pghot-precise maintains the same improvement.
- pghot-default shows little benefit (about 7.5% over NUMAB0) even though it
  achieves similar page promotion numbers. This mode doesn't track the
  accessing NID and by default promotes to NID=0, which probably isn't all
  that beneficial as processes are running on both Node 0 and Node 1.
- pghot-default recovers most of the performance (about 38% over NUMAB0) when
  balancing between the toptier nodes 0 and 1 is enabled in addition to hot
  page promotion.
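
The relative gains behind these observations can be recomputed from the
harmonic_mean_TEPS row. A small sketch follows; teps_gain is a helper name
made up for this illustration, and the inputs are copied from the tables
above.

```shell
# Percentage change of a TEPS value relative to a baseline.
# Usage: teps_gain <baseline> <value>
teps_gain() {
    awk -v base="$1" -v val="$2" \
        'BEGIN { printf "%+.1f%%\n", (val / base - 1) * 100 }'
}

base=5.08026e+08    # Base NUMAB0 harmonic_mean_TEPS

echo "Base NUMAB2:          $(teps_gain "$base" 7.48633e+08)"   # +47.4%
echo "pghot-default NUMAB2: $(teps_gain "$base" 5.46257e+08)"   # +7.5%
echo "pghot-precise NUMAB2: $(teps_gain "$base" 7.45101e+08)"   # +46.7%
echo "pghot-default NUMAB3: $(teps_gain "$base" 7.00515e+08)"   # +37.9%
```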