[PATCH 7/8 RFC] mm/memcontrol: optimize stock usage for cgroup v2
From: Joshua Hahn
Date: Fri Apr 10 2026 - 17:13:03 EST
In cgroup v2, tasks can only belong to leaf cgroups, so non-leaf
cgroups never receive direct charges. Stock left in these cgroups is
therefore wasted percpu memory that will never be consumed unless all
of their children are removed.

To avoid leaving unused but accounted charges in non-leaf cgroups,
drain the stock when a leaf cgroup becomes a parent.

There is one caveat: concurrent charging and child creation. When a
leaf cgroup becomes a parent while it is still charging a task, the
parent's stock can be drained and then refilled by the racing charge.

Instead of adding expensive synchronization, accept that the parent's
page_counter may keep these pages captive: it cannot use the stock
until all of its children have been offlined. The race is rare, and
the captive stock is bounded by MEMCG_CHARGE_BATCH = 64 pages.

This optimization does not apply to cgroup v1, where tasks can be
attached to any cgroup in the hierarchy, so stock can be consumed and
refilled for non-leaf cgroups as well.
Suggested-by: Johannes Weiner <hannes@xxxxxxxxxxx>
Signed-off-by: Joshua Hahn <joshua.hahnjy@xxxxxxxxx>
---
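Note (not part of the commit message): a rough userspace sketch of the
drain-then-refill race and its bound, for illustration only. The
toy_* names and struct toy_stock are hypothetical stand-ins, not
kernel code. The clamp in toy_refill() encodes the assumption that a
refill never leaves more than one batch cached, and the page size and
number of stocks passed to toy_captive_bytes() are assumptions too.

```c
#include <stddef.h>

/* Value quoted in the commit message. */
#define MEMCG_CHARGE_BATCH 64U

/* Toy stand-in for one cached stock; not the kernel's data structure. */
struct toy_stock {
	unsigned int nr_pages;
};

/* Models the drain performed when the first child comes online. */
static void toy_drain(struct toy_stock *stock)
{
	stock->nr_pages = 0;
}

/*
 * Models a racing charge that refills the stock after the drain.
 * Assumption: a refill never leaves more than one batch cached.
 */
static void toy_refill(struct toy_stock *stock, unsigned int nr_pages)
{
	stock->nr_pages += nr_pages;
	if (stock->nr_pages > MEMCG_CHARGE_BATCH)
		stock->nr_pages = MEMCG_CHARGE_BATCH;
}

/* Worst-case bytes held captive across nr_stocks refilled stocks. */
static size_t toy_captive_bytes(size_t nr_stocks, size_t page_size)
{
	return nr_stocks * MEMCG_CHARGE_BATCH * page_size;
}
```

Assuming 4 KiB pages, toy_captive_bytes(1, 4096) works out to 256 KiB
for a single refilled stock. If the race were to hit on several CPUs,
the total would scale with the number of stocks refilled.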
mm/memcontrol.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 6d50f5d667434..4be1638dde180 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4130,6 +4130,17 @@ static int mem_cgroup_css_online(struct cgroup_subsys_state *css)
 	 */
 	xa_store(&mem_cgroup_private_ids, memcg->id.id, memcg, GFP_KERNEL);
 
+	/*
+	 * On v2, non-leaf memcgs cannot directly be charged. This child's
+	 * parent is no longer a leaf, so drain the parent's stock.
+	 */
+	if (cgroup_subsys_on_dfl(memory_cgrp_subsys)) {
+		struct mem_cgroup *parent = parent_mem_cgroup(memcg);
+
+		if (parent)
+			page_counter_drain_stock(&parent->memory);
+	}
+
 	return 0;
 free_objcg:
 	for_each_node(nid) {
--
2.52.0