Re: [PATCH v4 26/31] mm: vmscan: prepare for reparenting MGLRU folios
From: Harry Yoo
Date: Thu Feb 12 2026 - 03:48:17 EST
On Thu, Feb 05, 2026 at 05:01:45PM +0800, Qi Zheng wrote:
> From: Qi Zheng <zhengqi.arch@xxxxxxxxxxxxx>
>
> Similar to traditional LRU folios, in order to solve the dying memcg
> problem, we also need to reparent MGLRU folios to the parent memcg when
> the memcg goes offline.
>
> However, there are the following challenges:
>
> 1. Each lruvec has between MIN_NR_GENS and MAX_NR_GENS generations, and the
> number of generations in the parent and child memcg may differ,
> so we cannot simply transfer MGLRU folios in the child memcg to the
> parent memcg as we did for traditional LRU folios.
> 2. The generation information is stored in folio->flags, but we cannot
> traverse these folios while holding the lru lock, otherwise it may
> cause a soft lockup.
> 3. In walk_update_folio(), the gen of folio and corresponding lru size
> may be updated, but the folio is not immediately moved to the
> corresponding lru list. Therefore, there may be folios of different
> generations on an LRU list.
> 4. In lru_gen_del_folio(), the generation to which the folio belongs is
> found based on the generation information in folio->flags, and the
> corresponding LRU size will be updated. Therefore, we need to update
> the lru size correctly during reparenting, otherwise the lru size may
> be updated incorrectly in lru_gen_del_folio().
>
> Finally, this patch chose a compromise method, which is to splice the lru
> list in the child memcg to the lru list of the same generation in the
> parent memcg during reparenting. To ensure that the parent memcg has
> the same generations, we need to increase the number of generations in
> the parent memcg to MAX_NR_GENS before reparenting.
>
> Of course, the same generation has different meanings in the parent and
> child memcg, so this will blur the hot and cold information of the
> folios. Other than that, this method is simple, keeps the lru size
> correct, and avoids having to handle some concurrency issues (such as
> with lru_gen_del_folio()).
>
> To prepare for the above work, this commit implements the specific
> functions, which will be used during reparenting.
>
> Suggested-by: Harry Yoo <harry.yoo@xxxxxxxxxx>
> Suggested-by: Imran Khan <imran.f.khan@xxxxxxxxxx>
> Signed-off-by: Qi Zheng <zhengqi.arch@xxxxxxxxxxxxx>
> ---
> include/linux/mmzone.h | 16 +++++
> mm/vmscan.c | 154 +++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 170 insertions(+)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 3e51190a55e4c..0c18b17f0fe2e 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index e2d9ef9a5dedc..8c6f8f0df24b1 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> +void lru_gen_reparent_memcg(struct mem_cgroup *memcg, struct mem_cgroup *parent)
> +{
> + int nid;
> +
> + for_each_node(nid) {
> + struct lruvec *child_lruvec, *parent_lruvec;
> + int type, zid;
> + struct zone *zone;
> + enum lru_list lru;
> +
> + child_lruvec = get_lruvec(memcg, nid);
> + parent_lruvec = get_lruvec(parent, nid);
> +
> + for_each_managed_zone_pgdat(zone, NODE_DATA(nid), zid, MAX_NR_ZONES - 1)
> + for (type = 0; type < ANON_AND_FILE; type++)
> + __lru_gen_reparent_memcg(child_lruvec, parent_lruvec, zid, type);
> +
> + for_each_lru(lru) {
> + for_each_managed_zone_pgdat(zone, NODE_DATA(nid), zid, MAX_NR_ZONES - 1) {
> + unsigned long size = mem_cgroup_get_zone_lru_size(child_lruvec, lru, zid);
> +
> + mem_cgroup_update_lru_size(parent_lruvec, lru, zid, size);
This part looks fine, but since size is an unsigned long here, I think
the nr_pages parameter in mem_cgroup_update_lru_size() should be long
instead of int. Could you please update the type as well?
Otherwise looks good to me,
Acked-by: Harry Yoo <harry.yoo@xxxxxxxxxx>
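
To illustrate the type concern, here is a minimal user-space sketch, not
kernel code. The struct and helper names (lru_counter, counter_add_int,
counter_add_long, counter_add_size, fits_in_int) are hypothetical and only
stand in for an LRU size counter and its update function. It assumes an
LP64 target, where long is 64 bits and int is 32 bits. Under that
assumption, an unsigned long page count above INT_MAX (more than 8 TiB on
one list with 4 KiB pages) cannot be passed through an int parameter
without being narrowed, while a long parameter carries it unchanged.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for a per-zone LRU size counter (not kernel code). */
struct lru_counter {
	long nr_pages;
};

/* Mirrors an "int nr_pages" parameter: callers must narrow their value. */
static void counter_add_int(struct lru_counter *c, int nr_pages)
{
	c->nr_pages += nr_pages;
}

/* Mirrors the suggested "long nr_pages" parameter. */
static void counter_add_long(struct lru_counter *c, long nr_pages)
{
	c->nr_pages += nr_pages;
}

/* Returns 1 if an unsigned long page count survives conversion to int. */
static int fits_in_int(unsigned long size)
{
	return size <= (unsigned long)INT_MAX;
}

/* Adds an unsigned long size through the long path, checking the range. */
static void counter_add_size(struct lru_counter *c, unsigned long size)
{
	assert(size <= (unsigned long)LONG_MAX);
	counter_add_long(c, (long)size);
}
```

With the int variant, a caller holding an unsigned long has to check
fits_in_int() first or accept an implementation-defined narrowing; the
long variant avoids that for any realistic page count.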
> + }
> + }
> + }
> +}
> +
> #endif /* CONFIG_MEMCG */
--
Cheers,
Harry / Hyeonggon