Re: [PATCH v2] memory tiering: Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled

From: Huang, Ying

Date: Wed Apr 01 2026 - 23:32:20 EST

Donet Tom <donettom@xxxxxxxxxxxxx> writes:

> In the current implementation, if NUMA_BALANCING_MEMORY_TIERING is
> disabled and the pages are on the lower tier, the pages may still be
> promoted.
>
> This happens because task_numa_work() updates the last_cpupid field to
> record the last access time only when NUMA_BALANCING_MEMORY_TIERING is
> enabled and the folio is on the lower tier. If
> NUMA_BALANCING_MEMORY_TIERING is disabled, the last_cpupid field
> can retain a valid last CPU id.
>
> In should_numa_migrate_memory(), the decision checks whether
> NUMA_BALANCING_MEMORY_TIERING is disabled, the folio is on the lower
> tier, and last_cpupid is invalid. However, because last_cpupid can be
> valid when NUMA_BALANCING_MEMORY_TIERING is disabled, the condition
> evaluates to false and migration is allowed.
>
> This patch prevents promotion when NUMA_BALANCING_MEMORY_TIERING is
> disabled and the folio is on the lower tier.
>
> Behavior before this change:
> ============================
> - If NUMA_BALANCING_NORMAL is enabled, migration occurs between
> nodes within the same memory tier, and promotion from lower
> tier to higher tier may also happen.
>
> - If NUMA_BALANCING_MEMORY_TIERING is enabled, promotion from
> lower tier to higher tier nodes is allowed.
>
> Behavior after this change:
> ===========================
> - If NUMA_BALANCING_NORMAL is enabled, migration will occur only
> between nodes within the same memory tier.
>
> - If NUMA_BALANCING_MEMORY_TIERING is enabled, promotion from lower
> tier to higher tier nodes will be allowed.
>
> - If both NUMA_BALANCING_MEMORY_TIERING and NUMA_BALANCING_NORMAL are
> enabled, both migration (same tier) and promotion (cross tier) are
> allowed.
>
> Fixes: 33024536bafd ("memory tiering: hot page selection with hint page fault latency")
> Signed-off-by: Donet Tom <donettom@xxxxxxxxxxxxx>
> ---
> v1 -> v2
> ========
> 1. Dropped changes in task_numa_fault() since the original changes
> already handle runtime disabling of NUMA_BALANCING_MEMORY_TIERING.
>
> v1 -> https://lore.kernel.org/all/20260320092251.1290207-1-donettom@xxxxxxxxxxxxx/
> ---
> kernel/sched/fair.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index bf948db905ed..4b43809a3fb1 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2024,8 +2024,12 @@ bool should_numa_migrate_memory(struct task_struct *p, struct folio *folio,
> this_cpupid = cpu_pid_to_cpupid(dst_cpu, current->pid);
> last_cpupid = folio_xchg_last_cpupid(folio, this_cpupid);
>
> + /*
> + * Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
> + * and the pages are on the lower tier.
> + */
> if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING) &&
> - !node_is_toptier(src_nid) && !cpupid_valid(last_cpupid))
> + !node_is_toptier(src_nid))
> return false;
>
> /*

No. Even if NUMA_BALANCING_MEMORY_TIERING is disabled, we should still
allow pages to be migrated from the lower tier to the higher tier via
NUMA_BALANCING_NORMAL. If we have precious DDR, why waste it? This
follows the semantics that NUMA_BALANCING_NORMAL had before
NUMA_BALANCING_MEMORY_TIERING was introduced.
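
To make the difference concrete, here is a minimal user-space sketch (not
kernel code) that models only the early "return false" check in
should_numa_migrate_memory(), before and after the patch. The helper names
and the boolean parameters are hypothetical stand-ins for
node_is_toptier(src_nid) and cpupid_valid(last_cpupid), and the mode bit
values are assumed to match include/linux/sched/sysctl.h.

```c
#include <stdbool.h>

/* Assumed to mirror the kernel's numa_balancing mode bits. */
#define NUMA_BALANCING_NORMAL          0x1
#define NUMA_BALANCING_MEMORY_TIERING  0x2

/*
 * Hypothetical model of the check before the patch: reject only when
 * tiering is off, the source node is lower tier, and last_cpupid is
 * not a valid cpupid.
 */
static bool early_reject_before(int mode, bool src_is_toptier,
				bool last_cpupid_valid)
{
	return !(mode & NUMA_BALANCING_MEMORY_TIERING) &&
	       !src_is_toptier && !last_cpupid_valid;
}

/*
 * Hypothetical model of the check after the patch: reject every
 * lower-tier source when tiering is off. last_cpupid_valid is kept in
 * the signature only for symmetry; the patched check ignores it.
 */
static bool early_reject_after(int mode, bool src_is_toptier,
			       bool last_cpupid_valid)
{
	(void)last_cpupid_valid;
	return !(mode & NUMA_BALANCING_MEMORY_TIERING) && !src_is_toptier;
}
```

With only NUMA_BALANCING_NORMAL set, a lower-tier folio whose last_cpupid
is valid is not rejected by the old check but is rejected by the patched
one. That is the lower-tier to higher-tier migration path I think should
stay.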

---
Best Regards,
Huang, Ying