Re: (sashiko review) [RFC PATCH 2/2] mm/damon/core: validate damos_quota_goal->nid for node_memcg_{used,free}_bp

From: SeongJae Park

Date: Fri Mar 27 2026 - 22:32:05 EST


Forwarding the Sashiko review so that it can be discussed on the mailing list.

# review url: https://sashiko.dev/#/patchset/20260328005412.7606-3-sj@xxxxxxxxxx
# start of sashiko.dev inline review
commit d10719045bbbd92307cf40ef0886ce0e81cb9caa
Author: SeongJae Park <sj@xxxxxxxxxx>

mm/damon/core: validate damos_quota_goal->nid for node_memcg_{used,free}_bp

This commit adds validation for the damos_quota_goal->nid when calculating
node memory cgroup usage ratios. If an invalid node ID is provided, it
returns 0% for used memory and 100% for free memory to prevent out-of-bounds
memory access during NODE_DATA() lookups.

> diff --git a/mm/damon/core.c b/mm/damon/core.c
> --- a/mm/damon/core.c
> +++ b/mm/damon/core.c
> @@ -2273,6 +2273,13 @@ static unsigned long damos_get_node_memcg_used_bp(
>  	unsigned long used_pages, numerator;
>  	struct sysinfo i;
>
> +	if (invalid_mem_node(goal->nid)) {
> +		if (goal->metric == DAMOS_QUOTA_NODE_MEMCG_USED_BP)
> +			return 0;
> +		else /* DAMOS_QUOTA_NODE_MEM_FREE_BP */

This isn't a bug, but there is a typo in the comment here. It says
DAMOS_QUOTA_NODE_MEM_FREE_BP instead of DAMOS_QUOTA_NODE_MEMCG_FREE_BP.

> +			return 10000;
> +	}
> +

[ ... ]

>  	si_meminfo_node(&i, goal->nid);
>  	if (goal->metric == DAMOS_QUOTA_NODE_MEMCG_USED_BP)
>  		numerator = used_pages;
>  	else /* DAMOS_QUOTA_NODE_MEMCG_FREE_BP */
>  		numerator = i.totalram - used_pages;
>  	return mult_frac(numerator, 10000, i.totalram);
>  }

Does this code safely handle nodes with zero managed pages?

While the new invalid_mem_node() check ensures the node has N_MEMORY set, a
node could still have zero managed pages if all its memory is reserved, or
if a concurrent hot-remove event drops the pages to zero just before
si_meminfo_node() runs. If i.totalram evaluates to 0, mult_frac() will
cause a divide-by-zero kernel panic.
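
One possible shape for a guard is sketched below as a userspace model, not
as a proposed kernel change. model_mult_frac() imitates the quotient and
remainder arithmetic of mult_frac(), and ratio_bp() is a hypothetical helper
name. Returning 0 for the used metric and 10000 for the free metric when
totalram is zero simply mirrors what the patch does for invalid node ids;
whether that is the right policy for an empty node is an assumption.

```c
/*
 * Userspace model of mult_frac(x, n, d): split x into quotient and
 * remainder before multiplying.  Like the kernel macro, it divides by d
 * unconditionally, so d == 0 must be filtered out by the caller.
 */
static unsigned long model_mult_frac(unsigned long x, unsigned long n,
				     unsigned long d)
{
	unsigned long q = x / d;
	unsigned long r = x % d;

	return q * n + r * n / d;
}

/*
 * Hypothetical guarded ratio calculation.  want_used selects the "used"
 * metric; otherwise the "free" metric is computed.  The fallback values
 * for totalram == 0 are an assumption copied from the invalid-nid path.
 */
static unsigned long ratio_bp(unsigned long used_pages,
			      unsigned long totalram, int want_used)
{
	unsigned long numerator;

	if (!totalram)
		return want_used ? 0 : 10000;

	numerator = want_used ? used_pages : totalram - used_pages;
	return model_mult_frac(numerator, 10000, totalram);
}
```

With this guard, ratio_bp(0, 0, 1) returns 0 instead of dividing by zero,
and ratio_bp(250, 1000, 1) still returns 2500.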

Can this calculation also silently overflow on 32-bit architectures?

Both used_pages and i.totalram are unsigned long. Whenever numerator <
i.totalram, the quotient term in mult_frac() is zero and the remainder is
numerator itself, so the macro computes numerator * 10000 directly. If
numerator exceeds 429,496 pages (roughly 1.76GB with 4KB pages),
numerator * 10000 will exceed the maximum 32-bit unsigned long value and
wrap around, resulting in an incorrect ratio.
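
The wraparound can be reproduced in userspace by modeling a 32-bit
unsigned long with uint32_t, as in the sketch below. mult_frac_u32() and
ratio_bp_u64() are hypothetical names used only for illustration. Widening
the intermediate product to 64 bits is one possible remedy, not necessarily
the preferred one; in the kernel, the 64-bit division on a 32-bit build
would presumably need a helper such as div64_u64().

```c
#include <stdint.h>

/* mult_frac() arithmetic as it behaves when unsigned long is 32 bits. */
static uint32_t mult_frac_u32(uint32_t x, uint32_t n, uint32_t d)
{
	uint32_t q = x / d;
	uint32_t r = x % d;

	/* r * n is computed modulo 2^32 and can wrap around. */
	return q * n + r * n / d;
}

/* Hypothetical alternative: do the multiplication in 64 bits. */
static uint32_t ratio_bp_u64(uint32_t x, uint32_t d)
{
	return (uint32_t)((uint64_t)x * 10000 / d);
}
```

For a node with 1,000,000 pages, mult_frac_u32(429496, 10000, 1000000)
still returns 4294, but one more page makes the product wrap, so
mult_frac_u32(429497, 10000, 1000000) returns 0, while
ratio_bp_u64(429497, 1000000) returns 4294.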


# end of sashiko.dev inline review
# review url: https://sashiko.dev/#/patchset/20260328005412.7606-3-sj@xxxxxxxxxx
#
# hkml [1] generated a draft of this mail. It can be regenerated
# using the command below:
#
# hkml patch sashiko_dev --for_forwarding \
# 20260328005412.7606-3-sj@xxxxxxxxxx
#
# [1] https://github.com/sjp38/hackermail

Sent using hkml (https://github.com/sjp38/hackermail)