[PATCH 6/8 RFC] mm/memcontrol: optimize memsw stock for cgroup v1

From: Joshua Hahn

Date: Fri Apr 10 2026 - 17:12:30 EST


Previously, each memcg had its own stock, which was shared by all page
counters within it. Specifically, in try_charge_memcg, the stock limit
check would occur before the memsw and memory page_counters were
charged hierarchically.

Now that the memcg stock has been folded into the page_counter level,
and try_charge_memcg's stock check has been replaced with a check
against the memory page_counter's stock, no fast path remains for
cgroup v1's memsw check.

Introduce a new stock for the memsw page_counter, charged and uncharged
independently from the memory page_counter. This provides better caching
on cgroup v1:

The best case scenario is when both the memsw and memory page_counters
can be served from their cached stock charge; this matches the old
behavior.

The halfway scenario is when either the memsw or memory page_counter
can be served from its stock, but the other cannot. This requires one
hierarchical charge.

The worst case scenario is when neither the memsw nor the memory
page_counter can be served from its stock, and both page_counter
hierarchies must be walked. This is the same as the old behavior.

By introducing an independent stock for memsw, we avoid the worst
case scenario more often, and the memsw counter can fail or succeed
separately from the memory page counter.

Suggested-by: Johannes Weiner <hannes@xxxxxxxxxxx>
Signed-off-by: Joshua Hahn <joshua.hahnjy@xxxxxxxxx>
---
mm/memcontrol.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 27d2edd5a7832..6d50f5d667434 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2245,8 +2245,10 @@ void drain_all_stock(struct mem_cgroup *root_memcg)
if (!mutex_trylock(&percpu_charge_mutex))
return;

- for_each_mem_cgroup_tree(memcg, root_memcg)
+ for_each_mem_cgroup_tree(memcg, root_memcg) {
page_counter_drain_stock(&memcg->memory);
+ page_counter_drain_stock(&memcg->memsw);
+ }

/* Drain obj_stock on all online CPUs */
migrate_disable();
@@ -2275,8 +2277,10 @@ static int memcg_hotplug_cpu_dead(unsigned int cpu)
/* no need for the local lock */
drain_obj_stock(&per_cpu(obj_stock, cpu));

- for_each_mem_cgroup(memcg)
+ for_each_mem_cgroup(memcg) {
page_counter_drain_cpu(&memcg->memory, cpu);
+ page_counter_drain_cpu(&memcg->memsw, cpu);
+ }

return 0;
}
@@ -4111,6 +4115,8 @@ static int mem_cgroup_css_online(struct cgroup_subsys_state *css)

/* failure is nonfatal, charges fall back to direct hierarchy */
page_counter_enable_stock(&memcg->memory, MEMCG_CHARGE_BATCH);
+ if (do_memsw_account())
+ page_counter_enable_stock(&memcg->memsw, MEMCG_CHARGE_BATCH);

/*
* Ensure mem_cgroup_from_private_id() works once we're fully online.
@@ -4175,6 +4181,7 @@ static void mem_cgroup_css_offline(struct cgroup_subsys_state *css)

drain_all_stock(memcg);
page_counter_disable_stock(&memcg->memory);
+ page_counter_disable_stock(&memcg->memsw);

mem_cgroup_private_id_put(memcg, 1);
}
--
2.52.0