Re: [PATCH v5 29/32] mm: memcontrol: prepare for reparenting non-hierarchical stats

From: Qi Zheng

Date: Fri Feb 27 2026 - 22:45:36 EST




On 2/28/26 2:18 AM, Yosry Ahmed wrote:
[..]
@@ -506,12 +517,10 @@ void reparent_memcg_lruvec_state_local(struct mem_cgroup *memcg,
 	for_each_node(nid) {
 		struct lruvec *child_lruvec = mem_cgroup_lruvec(memcg, NODE_DATA(nid));
 		struct lruvec *parent_lruvec = mem_cgroup_lruvec(parent, NODE_DATA(nid));
-		struct mem_cgroup_per_node *parent_pn;
 		unsigned long value = lruvec_page_state_local(child_lruvec, idx);
 
-		parent_pn = container_of(parent_lruvec, struct mem_cgroup_per_node, lruvec);
-
-		atomic_long_add(value, &(parent_pn->lruvec_stats->state_local[i]));
+		mod_memcg_lruvec_state(child_lruvec, idx, -value);

We can't use mod_memcg_lruvec_state() here, because the child memcg has
already been marked CSS_DYING. So in mod_memcg_lruvec_state(), we will
get the parent memcg instead of the child.

It seems we need to reimplement a function or add a parameter to
mod_memcg_lruvec_state() to solve the problem. What do you think?

Since child memcg is about to disappear, perhaps we can just add value
to parent memcg without handling the child memcg. Make sense?
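
For readers following along, here is a minimal userspace sketch of the redirection problem described above. It is not kernel code: toy_memcg, toy_non_dying(), toy_mod_state() and toy_reparent_via_mod() are hypothetical names, the dying flag only stands in for CSS_DYING, and the parent-side add is an assumption since the quoted hunk only shows the child-side call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a memcg; NOT the kernel's struct mem_cgroup. */
struct toy_memcg {
	struct toy_memcg *parent;
	bool dying;        /* models CSS_DYING */
	long state_local;  /* a single non-hierarchical counter */
};

/* Hypothetical helper: walk up to the first ancestor that is not dying. */
static struct toy_memcg *toy_non_dying(struct toy_memcg *memcg)
{
	while (memcg->dying && memcg->parent)
		memcg = memcg->parent;
	return memcg;
}

/* Models an update path that redirects a dying memcg to a live ancestor. */
static void toy_mod_state(struct toy_memcg *memcg, long val)
{
	toy_non_dying(memcg)->state_local += val;
}

/*
 * The transfer done through the redirecting update path after the child
 * is already dying. The parent-side add is assumed for illustration.
 */
static void toy_reparent_via_mod(struct toy_memcg *child,
				 struct toy_memcg *parent)
{
	long value = child->state_local;

	assert(parent != NULL);
	toy_mod_state(child, -value);  /* redirected: lands on the parent */
	toy_mod_state(parent, value);  /* cancels the previous update */
}
```

In this model, with a dying child holding 5 and a parent holding 10, calling toy_reparent_via_mod() leaves both counters unchanged, so nothing is actually transferred.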

Ugh yes, I missed that, thanks.

I don't think we can just leave the child memcg's stats wrong. Aside from
the fact that I would be nervous if access to those stats is still
possible after it's offlined (e.g. can userspace already have the
stats file open, or maybe some in-kernel code uses it), there's a
bigger issue.

When the child cgroup is released, css_release_work_fn() will flush
its stats and then it will be double counted at the parent.

Maybe refactor the parts of mod_memcg_lruvec_state() and
mod_memcg_state() without get_non_dying_memcg_{start/end}() into
helpers, and call those directly from the reparenting functions? Adding
a boolean argument to mod_memcg_lruvec_state() and mod_memcg_state()
will add a lot of churn, and naked boolean arguments are not ideal.

OK, will do.
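
A rough userspace sketch of the helper split suggested above, under stated assumptions: all names (toy_memcg, toy_mod_state_raw(), toy_mod_state(), toy_reparent(), toy_release_flush()) are hypothetical, and toy_release_flush() is only a crude model of the final flush at release, not what css_release_work_fn() actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a memcg; NOT the kernel's struct mem_cgroup. */
struct toy_memcg {
	struct toy_memcg *parent;
	bool dying;        /* models CSS_DYING */
	long state_local;  /* a single non-hierarchical counter */
};

/* Raw update: touches exactly the memcg it is given, no redirection. */
static void toy_mod_state_raw(struct toy_memcg *memcg, long val)
{
	memcg->state_local += val;
}

/* Normal update path: redirect a dying memcg, then do the raw update. */
static void toy_mod_state(struct toy_memcg *memcg, long val)
{
	while (memcg->dying && memcg->parent)
		memcg = memcg->parent;
	toy_mod_state_raw(memcg, val);
}

/* Reparenting calls the raw helper, so both sides are adjusted. */
static void toy_reparent(struct toy_memcg *child)
{
	long value = child->state_local;

	assert(child->parent != NULL);
	toy_mod_state_raw(child, -value);
	toy_mod_state_raw(child->parent, value);
}

/* Crude model of a final flush: fold what the child still holds upward. */
static void toy_release_flush(struct toy_memcg *child)
{
	assert(child->parent != NULL);
	toy_mod_state_raw(child->parent, child->state_local);
	child->state_local = 0;
}
```

In this model the child is left at zero after toy_reparent(), so the later flush has nothing left to add to the parent. If only the parent were incremented, the same flush would add the child's value a second time.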