Re: [PATCH v3 08/30] mm: memcontrol: prevent memory cgroup release in get_mem_cgroup_from_folio()

From: Qi Zheng

Date: Sun Jan 18 2026 - 22:20:46 EST

On 1/18/26 8:31 AM, Shakeel Butt wrote:

On Wed, Jan 14, 2026 at 07:32:35PM +0800, Qi Zheng wrote:

From: Muchun Song <songmuchun@xxxxxxxxxxxxx>

In the near future, a folio will no longer pin its corresponding
memory cgroup. To ensure safety, it will only be appropriate to
hold the rcu read lock or acquire a reference to the memory cgroup
returned by folio_memcg(), thereby preventing it from being released.

In the current patch, the rcu read lock is employed to safeguard
against the release of the memory cgroup in get_mem_cgroup_from_folio().

This serves as a preparatory measure for the reparenting of the
LRU pages.

Signed-off-by: Muchun Song <songmuchun@xxxxxxxxxxxxx>
Signed-off-by: Qi Zheng <zhengqi.arch@xxxxxxxxxxxxx>
Reviewed-by: Harry Yoo <harry.yoo@xxxxxxxxxx>
---
mm/memcontrol.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 982c9f5cf72cb..0458fc2e810ff 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -991,14 +991,18 @@ struct mem_cgroup *get_mem_cgroup_from_current(void)
*/
struct mem_cgroup *get_mem_cgroup_from_folio(struct folio *folio)
{
- struct mem_cgroup *memcg = folio_memcg(folio);
+ struct mem_cgroup *memcg;
if (mem_cgroup_disabled())
return NULL;
+ if (!folio_memcg_charged(folio))
+ return root_mem_cgroup;
+
rcu_read_lock();
- if (!memcg || WARN_ON_ONCE(!css_tryget(&memcg->css)))
- memcg = root_mem_cgroup;
+ do {
+ memcg = folio_memcg(folio);
+ } while (unlikely(!css_tryget(&memcg->css)));

I went back to [1] where AI raised the following concern which I want to
address:

If css_tryget() fails (e.g. refcount is 0), this loop spins indefinitely
with the RCU read lock held. Is it guaranteed that folio_memcg() will
return a different, alive memcg in subsequent iterations?

Will css_tryget() ever fail for the memcg returned by folio_memcg()?
Let's suppose memcg of a given folio is being offlined. The objcg
reparenting happens in memcg_reparent_objcgs() which is called in
offline_css() chain and we know that the offline context holds a
reference on the css being offlined (see css_killed_work_fn()).

Also let's suppose the offline process has the last reference on the
memcg's css. Now we have following two scenarios:

Scenario 1:

get_mem_cgroup_from_folio() css_killed_work_fn()
memcg = folio_memcg(folio) offline_css(css)
memcg_reparent_objcgs()
css_tryget(memcg)
css_put(css)

In the above case css_tryget() will not fail.

Scenario 2:

get_mem_cgroup_from_folio() css_killed_work_fn()
memcg = folio_memcg(folio) offline_css(css)
memcg_reparent_objcgs()
css_put(css) // last reference
css_tryget(memcg)
// retry on failure

In the above case the context in get_mem_cgroup_from_folio() will retry
and will get different memcg during reparenting happening before the
last css_put(css).

So, I think we are good and AI is mistaken.

Folks, please check if I missed something.

LGTM, thank you for such a detailed analysis!

If the folio is isolated (e.g. via migrate_misplaced_folio()), it might be
missed by reparenting logic that iterates LRU lists.

LRU isolation will not impact reparenting logic, so we can discount this
as well.

In that case, the
folio would continue pointing to the dying memcg, leading to a hard lockup.

Also, folio_memcg() calls __folio_memcg(), which reads folio->memcg_data
without READ_ONCE().

Oh I think I know why AI is confused. It is because it is looking at
folio->memcg i.e. state with this patch only and not the state after the
series. In the current state the folio holds the reference on memcg, so
css_tryget() will never fail.

Since this loop waits for memcg_data to be updated
by another CPU (reparenting), could the compiler hoist the load out of
the loop, preventing the update from being seen?

Finally, the previous code fell back to root_mem_cgroup on failure. Is it
safe to remove that fallback? If css_tryget() fails unexpectedly, hanging
seems more severe than the previous behavior of warning and falling back.

[1] https://lore.kernel.org/all/7ia4ldikrbsj.fsf@xxxxxxxxxxxxxxxxxxxxx/