Re: [PATCH v4 26/31] mm: vmscan: prepare for reparenting MGLRU folios

From: Qi Zheng

Date: Sun Feb 15 2026 - 02:29:28 EST

On 2/12/26 4:46 PM, Harry Yoo wrote:
On Thu, Feb 05, 2026 at 05:01:45PM +0800, Qi Zheng wrote:
From: Qi Zheng <zhengqi.arch@xxxxxxxxxxxxx>

Similar to traditional LRU folios, in order to solve the dying memcg
problem, we also need to reparent MGLRU folios to the parent memcg when
the memcg goes offline.

However, there are the following challenges:

1. Each lruvec has between MIN_NR_GENS and MAX_NR_GENS generations, and
the number of generations of the parent and child memcg may differ,
so we cannot simply transfer MGLRU folios in the child memcg to the
parent memcg as we did for traditional LRU folios.
2. The generation information is stored in folio->flags, but we cannot
traverse these folios while holding the lru lock, otherwise it may
cause a softlockup.
3. In walk_update_folio(), the gen of folio and corresponding lru size
may be updated, but the folio is not immediately moved to the
corresponding lru list. Therefore, there may be folios of different
generations on an LRU list.
4. In lru_gen_del_folio(), the generation to which the folio belongs is
found based on the generation information in folio->flags, and the
corresponding LRU size will be updated. Therefore, we need to update
the lru size correctly during reparenting, otherwise the lru size may
be updated incorrectly in lru_gen_del_folio().

As a compromise, this patch splices each lru list in the child memcg onto
the lru list of the same generation in the parent memcg during
reparenting. To ensure that the parent memcg has the same generation, we
need to increase the number of generations in the parent memcg to
MAX_NR_GENS before reparenting.

Of course, the same generation has different meanings in the parent and
child memcg, so this blurs the hot and cold information of the folios.
But other than that, this method is simple enough, the lru size is
correct, and there is no need to handle concurrency with paths such as
lru_gen_del_folio().

To prepare for the above work, this commit implements the specific
functions, which will be used during reparenting.
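
For illustration only (this is not the patch code), the approach described
above can be sketched as a small user-space model. All toy_* names and the
list helpers are hypothetical, the MIN_NR_GENS/MAX_NR_GENS values of 2 and 4
are assumed for the sketch, and indexing the lists directly by generation
index is a simplification:

```c
#include <assert.h>

#define MIN_NR_GENS 2	/* assumed value for this sketch */
#define MAX_NR_GENS 4	/* assumed value for this sketch */

/* Minimal circular doubly-linked list, modelled on the kernel's list_head. */
struct toy_node { struct toy_node *prev, *next; };

/* Hypothetical stand-in for an lruvec: one list and one size per generation. */
struct toy_lruvec {
	unsigned long min_seq, max_seq;
	struct toy_node lists[MAX_NR_GENS];
	long nr_pages[MAX_NR_GENS];
};

static void toy_list_init(struct toy_node *head)
{
	head->prev = head->next = head;
}

static int toy_list_empty(const struct toy_node *head)
{
	return head->next == head;
}

static void toy_list_add_tail(struct toy_node *node, struct toy_node *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Move every entry of @from to the tail of @to without walking the entries. */
static void toy_list_splice_tail(struct toy_node *from, struct toy_node *to)
{
	if (toy_list_empty(from))
		return;
	from->next->prev = to->prev;
	to->prev->next = from->next;
	from->prev->next = to;
	to->prev = from->prev;
	toy_list_init(from);
}

static int toy_nr_gens(const struct toy_lruvec *v)
{
	return (int)(v->max_seq - v->min_seq + 1);
}

static void toy_lruvec_init(struct toy_lruvec *v, unsigned long min_seq,
			    unsigned long max_seq)
{
	int gen;

	v->min_seq = min_seq;
	v->max_seq = max_seq;
	assert(toy_nr_gens(v) >= MIN_NR_GENS && toy_nr_gens(v) <= MAX_NR_GENS);
	for (gen = 0; gen < MAX_NR_GENS; gen++) {
		toy_list_init(&v->lists[gen]);
		v->nr_pages[gen] = 0;
	}
}

static void toy_add_folio(struct toy_lruvec *v, struct toy_node *folio,
			  int gen, long nr_pages)
{
	toy_list_add_tail(folio, &v->lists[gen]);
	v->nr_pages[gen] += nr_pages;
}

static void toy_reparent(struct toy_lruvec *child, struct toy_lruvec *parent)
{
	int gen;

	/* Step 1: grow the parent to MAX_NR_GENS so every index is in use. */
	while (toy_nr_gens(parent) < MAX_NR_GENS)
		parent->max_seq++;

	/* Step 2: splice same-index lists and carry the sizes over. */
	for (gen = 0; gen < MAX_NR_GENS; gen++) {
		toy_list_splice_tail(&child->lists[gen], &parent->lists[gen]);
		parent->nr_pages[gen] += child->nr_pages[gen];
		child->nr_pages[gen] = 0;
	}
}

/* Usage: returns the parent's size for generation index 3 after reparenting. */
long toy_demo(void)
{
	struct toy_lruvec child, parent;
	struct toy_node a, b;

	toy_lruvec_init(&child, 0, 3);	/* child has 4 generations */
	toy_lruvec_init(&parent, 0, 1);	/* parent has only 2 */
	toy_add_folio(&child, &a, 3, 8);
	toy_add_folio(&parent, &b, 1, 4);
	toy_reparent(&child, &parent);
	assert(toy_nr_gens(&parent) == MAX_NR_GENS);
	assert(toy_list_empty(&child.lists[3]));
	return parent.nr_pages[3];
}
```

The splice is O(1) per list, so no folio is visited under the lock, and the
per-generation sizes move together with the lists. What the model cannot
preserve is the meaning of a generation across the two memcgs, as noted
above.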

Suggested-by: Harry Yoo <harry.yoo@xxxxxxxxxx>
Suggested-by: Imran Khan <imran.f.khan@xxxxxxxxxx>
Signed-off-by: Qi Zheng <zhengqi.arch@xxxxxxxxxxxxx>
---
include/linux/mmzone.h | 16 +++++
mm/vmscan.c | 154 +++++++++++++++++++++++++++++++++++++++++
2 files changed, 170 insertions(+)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 3e51190a55e4c..0c18b17f0fe2e 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
diff --git a/mm/vmscan.c b/mm/vmscan.c
index e2d9ef9a5dedc..8c6f8f0df24b1 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
+void lru_gen_reparent_memcg(struct mem_cgroup *memcg, struct mem_cgroup *parent)
+{
+ int nid;
+
+ for_each_node(nid) {
+ struct lruvec *child_lruvec, *parent_lruvec;
+ int type, zid;
+ struct zone *zone;
+ enum lru_list lru;
+
+ child_lruvec = get_lruvec(memcg, nid);
+ parent_lruvec = get_lruvec(parent, nid);
+
+ for_each_managed_zone_pgdat(zone, NODE_DATA(nid), zid, MAX_NR_ZONES - 1)
+ for (type = 0; type < ANON_AND_FILE; type++)
+ __lru_gen_reparent_memcg(child_lruvec, parent_lruvec, zid, type);
+
+ for_each_lru(lru) {
+ for_each_managed_zone_pgdat(zone, NODE_DATA(nid), zid, MAX_NR_ZONES - 1) {
+ unsigned long size = mem_cgroup_get_zone_lru_size(child_lruvec, lru, zid);
+
+ mem_cgroup_update_lru_size(parent_lruvec, lru, zid, size);

This part looks fine, but I think the nr_pages parameter
in mem_cgroup_update_lru_size() should be long instead of int.
Could you please update the type as well?
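
To illustrate the concern (a sketch only; the toy_* names are hypothetical
and not kernel APIs): the quoted hunk stores the size in an unsigned long,
so a count above INT_MAX cannot pass through an int parameter unchanged,
while a long parameter carries it as-is on 64-bit builds:

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical check: would this page count survive an int parameter? */
bool toy_fits_in_int(unsigned long nr_pages)
{
	return nr_pages <= (unsigned long)INT_MAX;
}

/* Hypothetical stand-in for a size update that takes a long delta. */
long toy_update_size(long cur, long nr_pages)
{
	return cur + nr_pages;
}
```

With a 32-bit int, toy_fits_in_int() is false from 2^31 pages upward, which
is where an int nr_pages would stop representing the value.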

Makes sense. I think it would be better to do this in a separate patch;
I will send one and add your Suggested-by.


Otherwise looks good to me,
Acked-by: Harry Yoo <harry.yoo@xxxxxxxxxx>

Thanks!


+ }
+ }
+ }
+}
+
#endif /* CONFIG_MEMCG */