Re: [PATCH v2 27/28] mm: memcontrol: eliminate the problem of dying memory cgroup for LRU folios

From: Qi Zheng

Date: Sun Dec 21 2025 - 23:00:01 EST




On 12/18/25 10:06 PM, Johannes Weiner wrote:
On Wed, Dec 17, 2025 at 03:27:51PM +0800, Qi Zheng wrote:
From: Muchun Song <songmuchun@xxxxxxxxxxxxx>

Pagecache pages are charged at allocation time and hold a reference
to the original memory cgroup until reclaimed. Depending on memory
pressure, page sharing patterns between different cgroups and cgroup
creation/destruction rates, many dying memory cgroups can be pinned
by pagecache pages, reducing page reclaim efficiency and wasting
memory. Converting LRU folios and most other raw memory cgroup pins
to object cgroup references fixes this long-standing problem.

This is already in the cover letter. Please describe here what the
patch itself does. IOW, now that everything is set up, switch
folio->memcg_data pointers to objcgs, update the accessors, and
execute reparenting on cgroup death.

Got it, will do.


Finally, the folio->memcg_data of LRU folios and kmem folios will
always point to an object cgroup. The folio->memcg_data of slab
folios will point to a vector of object cgroups.

@@ -223,22 +223,55 @@ static inline void __memcg_reparent_objcgs(struct mem_cgroup *src,
static inline void reparent_locks(struct mem_cgroup *src, struct mem_cgroup *dst)
{
+ int nid, nest = 0;
+
spin_lock_irq(&objcg_lock);
+ for_each_node(nid) {
+ spin_lock_nested(&mem_cgroup_lruvec(src,
+ NODE_DATA(nid))->lru_lock, nest++);
+ spin_lock_nested(&mem_cgroup_lruvec(dst,
+ NODE_DATA(nid))->lru_lock, nest++);
+ }
}

Looks okay to me. If this should turn out to be a scalability problem
in practice, we can make objcgs per-node, and then reparent lru/objcg
pairs on a per-node basis without nesting locks.

static inline void reparent_unlocks(struct mem_cgroup *src, struct mem_cgroup *dst)
{
+ int nid;
+
+ for_each_node(nid) {
+ spin_unlock(&mem_cgroup_lruvec(dst, NODE_DATA(nid))->lru_lock);
+ spin_unlock(&mem_cgroup_lruvec(src, NODE_DATA(nid))->lru_lock);
+ }
spin_unlock_irq(&objcg_lock);
}
+static void memcg_reparent_lru_folios(struct mem_cgroup *src,
+ struct mem_cgroup *dst)
+{
+ if (lru_gen_enabled())
+ lru_gen_reparent_memcg(src, dst);
+ else
+ lru_reparent_memcg(src, dst);
+}
+
static void memcg_reparent_objcgs(struct mem_cgroup *src)
{
struct obj_cgroup *objcg = rcu_dereference_protected(src->objcg, true);
struct mem_cgroup *dst = parent_mem_cgroup(src);
+retry:
+ if (lru_gen_enabled())
+ max_lru_gen_memcg(dst);
+
reparent_locks(src, dst);
+ if (lru_gen_enabled() && !recheck_lru_gen_max_memcg(dst)) {
+ reparent_unlocks(src, dst);
+ cond_resched();
+ goto retry;
+ }
__memcg_reparent_objcgs(src, dst);
+ memcg_reparent_lru_folios(src, dst);

Please inline memcg_reparent_lru_folios() here, to keep the lru vs
lrugen switching as "flat" as possible:

if (lru_gen_enabled()) {
        if (!recheck_lru_gen_max_memcg(parent)) {
reparent_unlocks(memcg, parent);
cond_resched();
goto retry;
}
lru_gen_reparent_memcg(memcg, parent);
} else {
lru_reparent_memcg(memcg, parent);
}

Looks better, will change to this style.


@@ -989,6 +1022,8 @@ struct mem_cgroup *get_mem_cgroup_from_current(void)
/**
* get_mem_cgroup_from_folio - Obtain a reference on a given folio's memcg.
* @folio: folio from which memcg should be extracted.
+ *
+ * The folio and objcg or memcg binding rules can refer to folio_memcg().

See folio_memcg() for folio->objcg/memcg binding rules.

OK, will do.