Re: (sashiko review) [PATCH 2/2] mm/damon/core: validate damos_quota_goal->nid for node_memcg_{used,free}_bp

From: SeongJae Park

Date: Sun Mar 29 2026 - 11:38:09 EST


Forwarding Sashiko review for doing discussions via mails.

# review url: https://sashiko.dev/#/patchset/20260329043902.46163-3-sj@xxxxxxxxxx
# start of sashiko.dev inline review
commit 1ede2d837f5cbf8e8d96fe4c6eab34aeaa071d56
Author: SeongJae Park <sj@xxxxxxxxxx>

mm/damon/core: validate damos_quota_goal->nid for node_memcg_{used,free}_bp

This commit adds validation for the node id in damos_get_node_memcg_used_bp()
to prevent out-of-bounds memory access. If an invalid node id is provided, it
now safely returns 0% for used memory and 100% for free memory.

> diff --git a/mm/damon/core.c b/mm/damon/core.c
> index 26bdeb658799..c45c62f8449d 100644
> --- a/mm/damon/core.c
> +++ b/mm/damon/core.c
> @@ -2273,6 +2273,13 @@ static unsigned long damos_get_node_memcg_used_bp(
> unsigned long used_pages, numerator;
> struct sysinfo i;
>
> + if (invalid_mem_node(goal->nid)) {
> + if (goal->metric == DAMOS_QUOTA_NODE_MEMCG_USED_BP)
> + return 0;
> + else /* DAMOS_QUOTA_NODE_MEMCG_FREE_BP */
> + return 10000;
> + }
> +
> memcg = mem_cgroup_get_from_id(goal->memcg_id);
> if (!memcg) {
> if (goal->metric == DAMOS_QUOTA_NODE_MEMCG_USED_BP)

This isn't a bug introduced by this patch, but can the subtraction later in
this function underflow?

damos_get_node_memcg_used_bp() {
...
else /* DAMOS_QUOTA_NODE_MEMCG_FREE_BP */
numerator = i.totalram - used_pages;
...
}

Since used_pages is fetched locklessly and i.totalram is the global managed
pages count, could asynchronous rstat accounting lag or concurrent memory
hot-remove operations cause used_pages to transiently exceed i.totalram?

Also regarding the end of this function, is it possible for i.totalram to
evaluate to 0 here?

damos_get_node_memcg_used_bp() {
...
return mult_frac(numerator, 10000, i.totalram);
}

If a node's memory is entirely reserved or concurrently offlined,
si_meminfo_node() might set i.totalram to 0. If that happens, does
mult_frac() execute an unchecked division by zero internally?

Finally, does the mult_frac() macro overflow unsigned long on 32-bit
architectures?

The macro evaluates to q * n_ + r * n_ / d_, where r is
numerator % i.totalram. On 32-bit systems, r is a 32-bit unsigned long.

For nodes with more than ~1.7GB of RAM (where i.totalram exceeds ~429,496
pages), could r * 10000 silently overflow the 32-bit boundary and return a
severely truncated value?


# end of sashiko.dev inline review
# review url: https://sashiko.dev/#/patchset/20260329043902.46163-3-sj@xxxxxxxxxx
#
# hkml [1] generated a draft of this mail. It can be regenerated
# using below command:
#
# hkml patch sashiko_dev --for_forwarding \
# 20260329043902.46163-3-sj@xxxxxxxxxx
#
# [1] https://github.com/sjp38/hackermail

Sent using hkml (https://github.com/sjp38/hackermail)