Re: (sashiko review) [RFC PATCH 2/2] mm/damon/core: validate damos_quota_goal->nid for node_memcg_{used,free}_bp
From: SeongJae Park
Date: Sat Mar 28 2026 - 09:43:51 EST
On Fri, 27 Mar 2026 19:30:20 -0700 SeongJae Park <sj@xxxxxxxxxx> wrote:
> Forwarding Sashiko review for doing discussions via mails.
>
> # review url: https://sashiko.dev/#/patchset/20260328005412.7606-3-sj@xxxxxxxxxx
> # start of sashiko.dev inline review
> commit d10719045bbbd92307cf40ef0886ce0e81cb9caa
> Author: SeongJae Park <sj@xxxxxxxxxx>
>
> mm/damon/core: validate damos_quota_goal->nid for node_memcg_{used,free}_bp
>
> This commit adds validation for the damos_quota_goal->nid when calculating
> node memory cgroup usage ratios. If an invalid node ID is provided, it
> returns 0% for used memory and 100% for free memory to prevent out-of-bounds
> memory access during NODE_DATA() lookups.
>
> > diff --git a/mm/damon/core.c b/mm/damon/core.c
> > --- a/mm/damon/core.c
> > +++ b/mm/damon/core.c
> > @@ -2273,6 +2273,13 @@ static unsigned long damos_get_node_memcg_used_bp(
> > unsigned long used_pages, numerator;
> > struct sysinfo i;
> >
> > + if (invalid_mem_node(goal->nid)) {
> > + if (goal->metric == DAMOS_QUOTA_NODE_MEMCG_USED_BP)
> > + return 0;
> > + else /* DAMOS_QUOTA_NODE_MEM_FREE_BP */
>
> This isn't a bug, but there is a typo in the comment here. It says
> DAMOS_QUOTA_NODE_MEM_FREE_BP instead of DAMOS_QUOTA_NODE_MEMCG_FREE_BP.

Good catch, I will fix this in the next spin.

>
> > + return 10000;
> > + }
> > +
>
> [ ... ]
>
> > si_meminfo_node(&i, goal->nid);
> > if (goal->metric == DAMOS_QUOTA_NODE_MEMCG_USED_BP)
> > numerator = used_pages;
> > else /* DAMOS_QUOTA_NODE_MEMCG_FREE_BP */
> > numerator = i.totalram - used_pages;
> > return mult_frac(numerator, 10000, i.totalram);
> > }
>
> Does this code safely handle nodes with zero managed pages?
>
> While the new invalid_mem_node() check ensures the node has N_MEMORY set, a
> node could still have zero managed pages if all its memory is reserved, or
> if a concurrent hot-remove event drops the pages to zero just before
> si_meminfo_node() runs. If i.totalram evaluates to 0, mult_frac() will
> cause a divide-by-zero kernel panic.
>
> Can this calculation also silently overflow on 32-bit architectures?
>
> Both used_pages and i.totalram are unsigned long. Since numerator <=
> i.totalram, mult_frac() bypasses the initial integer division and computes
> numerator * 10000 directly.

To my understanding, mult_frac() doesn't do such a bypass.  Correct me if I'm
wrong.

> If the memory cgroup uses more than ~1.7GB
> (about 429,496 4KB pages), numerator * 10000 will exceed the maximum
> 32-bit unsigned long value, resulting in an incorrect ratio.

So this concern from sashiko is wrong.
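
For reference, below is a minimal userspace sketch of the quotient/remainder
split that, to my understanding, mult_frac() performs.  It is not the kernel
macro itself: mult_frac_sketch() is a hypothetical name, and uint32_t is only
an assumed stand-in for a 32-bit unsigned long.

```c
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the kernel's mult_frac(x, n, d).
 * uint32_t is assumed here to model a 32-bit unsigned long.
 * The caller must ensure d != 0.
 */
static uint32_t mult_frac_sketch(uint32_t x, uint32_t n, uint32_t d)
{
	uint32_t q = x / d;	/* whole multiples of d contained in x */
	uint32_t r = x % d;	/* remainder, always smaller than d */

	/* The split is unconditional; q is simply 0 when x < d. */
	return q * n + r * n / d;
}
```

For example, mult_frac_sketch(50, 10000, 100) gives q = 0 and r = 50, and
returns 5000.  Note the remainder term r * n is still evaluated in the
operands' type.
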
Thanks,
SJ
[...]