Re: [PATCH v2 00/28] Eliminate Dying Memory Cgroup

From: Qi Zheng

Date: Mon Dec 29 2025 - 04:43:47 EST

On 12/24/25 8:58 AM, Shakeel Butt wrote:

On Wed, Dec 24, 2025 at 12:43:00AM +0000, Yosry Ahmed wrote:
[...]

I think you meant child's memcg here.

Yes, sorry.

before
reparenting, and using it to update the stats after reparenting? A grace
period only works if the entire scope of using the memcg is within the
RCU critical section.

Yeah this is an issue.

For example, __mem_cgroup_try_charge_swap() currently does this when
incrementing MEMCG_SWAP. While this specific example isn't problematic
because the reference won't be dropped until MEMCG_SWAP is decremented
again, the pattern of grabbing a ref to the memcg then updating a stat
could generally cause the problem.

Most stats are updated using lruvec_stat_mod_folio(), which updates the
stats in the same RCU critical section as obtaining the memcg pointer
from the folio, so it can be fixed with a grace period. However, I think
it can be easily missed in the future if other code paths update memcg
stats in a different way. We should try to enforce that stat updates
can only happen from the same RCU critical section where the memcg
pointer is acquired.

The core stats update functions are mod_memcg_state() and
mod_memcg_lruvec_state(). If, for v1 only, we add an additional check for
CSS_DYING and go to the parent if CSS_DYING is set, then shouldn't we avoid
this issue?

But this is still racy, right? The cgroup could become dying right after
we check CSS_DYING, no?

We do reparenting in the css_offline() callback, and cgroup offlining
happens somewhat like this:

1. Set CSS_DYING
2. Trigger percpu ref kill
3. Kernel makes sure the css ref kill is seen by all CPUs and then triggers
the css_offline callback.

It seems that we can add the following to
mem_cgroup_css_free():

parent->vmstats->state_local += child->vmstats->state_local;

Right? I will continue to take a closer look.
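
To make the idea concrete, here is a minimal userspace sketch of folding a
child's local counters into its parent. It is only an illustration: the
struct names, NR_STATS_SKETCH and fold_local_stats_into_parent() are
hypothetical, and it assumes state_local is a per-item array, so the one-line
"+=" above becomes a loop. It also assumes that no updater can still reach
the child when this runs.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
#define NR_STATS_SKETCH 4

struct vmstats_sketch {
	long state_local[NR_STATS_SKETCH];
};

struct memcg_sketch {
	struct memcg_sketch *parent;
	struct vmstats_sketch vmstats;
};

/*
 * Add every local counter of the child to the matching counter of the
 * parent, then clear the child's copy so nothing is counted twice.
 */
static void fold_local_stats_into_parent(struct memcg_sketch *child)
{
	struct memcg_sketch *parent = child->parent;
	size_t i;

	if (!parent)
		return;	/* the root has no parent to fold into */

	for (i = 0; i < NR_STATS_SKETCH; i++) {
		parent->vmstats.state_local[i] += child->vmstats.state_local[i];
		child->vmstats.state_local[i] = 0;
	}
}
```

Whether the free path is the right place for this depends on whether all
late updates have already landed in the child by then, which is the open
question in this thread.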

Thanks,
Qi

So, if in the stats update function we check the CSS_DYING flag and do the
actual stats update within the same RCU read-side critical section, I think
we are good.
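
A rough single-threaded model of that check, for illustration only. The
names memcg_model, model_rcu_read_lock(), model_rcu_read_unlock() and
model_mod_state() are hypothetical; the empty lock helpers merely mark where
rcu_read_lock()/rcu_read_unlock() would sit in the kernel, and the "dying"
field stands in for CSS_DYING. Walking up past several dying ancestors is an
assumption of this sketch, not something stated above.

```c
#include <stdbool.h>

struct memcg_model {
	struct memcg_model *parent;
	bool dying;	/* stands in for the CSS_DYING flag */
	long stat;	/* a single counter, for brevity */
};

/* Placeholders marking the extent of the RCU read-side critical section. */
static void model_rcu_read_lock(void) { }
static void model_rcu_read_unlock(void) { }

/*
 * Check the dying flag and apply the update inside one critical section,
 * so that a grace period after the flag is set covers every updater that
 * might still have targeted the child.
 */
static void model_mod_state(struct memcg_model *memcg, long delta)
{
	model_rcu_read_lock();
	/* Redirect the update to the nearest ancestor that is not dying. */
	while (memcg->dying && memcg->parent)
		memcg = memcg->parent;
	memcg->stat += delta;
	model_rcu_read_unlock();
}
```

The point of the model is only the ordering: the flag check and the counter
update sit between the same lock/unlock pair, which matches the three-step
offlining sequence above, where css_offline runs only after all CPUs have
seen the ref kill.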