Re: [PATCH v3] mm: memory-tiering: fix PGPROMOTE_CANDIDATE counting
From: Andrew Morton
Date: Mon Sep 01 2025 - 15:59:26 EST
On Mon, 1 Sep 2025 17:01:22 +0800 Ruan Shiyang <ruansy.fnst@xxxxxxxxxxx> wrote:
> Goto-san reported confusing pgpromote statistics where the
> pgpromote_success count significantly exceeded pgpromote_candidate.
>
> On a system with three nodes (nodes 0-1: DRAM 4GB, node 2: NVDIMM 4GB):
> # Enable demotion only
> echo 1 > /sys/kernel/mm/numa/demotion_enabled
> numactl -m 0-1 memhog -r200 3500M >/dev/null &
> pid=$!
> sleep 2
> numactl memhog -r100 2500M >/dev/null &
> sleep 10
> kill -9 $pid # terminate the 1st memhog
> # Enable promotion
> echo 2 > /proc/sys/kernel/numa_balancing
>
> After a few seconds, we observed `pgpromote_candidate < pgpromote_success`:
> $ grep -e pgpromote /proc/vmstat
> pgpromote_success 2579
> pgpromote_candidate 0
>
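A quick way to check for the same inconsistency elsewhere, as a minimal
sketch: the helper name check_pgpromote is made up for illustration, and
it only compares the two counters quoted above.

```shell
# Hypothetical helper (not part of the patch): reads vmstat-style
# "name value" lines on stdin and compares the two counters.
check_pgpromote() {
    awk '
        $1 == "pgpromote_success"   { success = $2 }
        $1 == "pgpromote_candidate" { candidate = $2 }
        END {
            printf "success=%d candidate=%d\n", success, candidate
            if (success > candidate) {
                print "inconsistent: success exceeds candidate"
                exit 1
            }
        }
    '
}
```

On a live system this would be run as `check_pgpromote < /proc/vmstat`;
a non-zero exit status means pgpromote_success exceeds
pgpromote_candidate.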
> In this scenario, after terminating the first memhog, the conditions for
> pgdat_free_space_enough() are quickly met, which triggers promotion.
> However, these migrated pages are counted only in PGPROMOTE_SUCCESS,
> not in PGPROMOTE_CANDIDATE.
>
> To fix these confusing statistics, introduce PGPROMOTE_CANDIDATE_NRL to
> count the missed promotion pages. These pages are deliberately not
> counted in PGPROMOTE_CANDIDATE, to avoid changing the existing algorithm
> or performance of the promotion rate limit.
>
> ...
>
It would be good to have a Fixes: here, to tell people how far back to
backport it.
Could be either c6833e10008f or c959924b0dc5 afaict. I'll go with
c6833e10008f, OK?
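
For reference, a sketch of how the tag line can be generated, assuming
it is run inside a kernel tree that contains the commit. The function
name fixes_tag is made up for illustration. The commit subject is not
quoted in this thread, so it is left to git to fill in.

```shell
# Hypothetical helper: print a Fixes: line for the given commit,
# using a 12-character abbreviated hash and the commit subject.
fixes_tag() {
    git log -1 --abbrev=12 --format='Fixes: %h ("%s")' "$1"
}
```

Usage would be `fixes_tag c6833e10008f`, with the output pasted above
the Signed-off-by: lines.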