[PATCH v3 0/4] memcg: shrink obj_stock_pcp and cache multiple objcgs

From: Shakeel Butt

Date: Mon May 25 2026 - 23:40:09 EST


Commit 01b9da291c49 ("mm: memcontrol: convert objcg to be per-memcg
per-node type") split a memcg's single obj_cgroup into one per NUMA
node so that reparenting LRU folios can take per-node lru locks. As a
side effect, the per-CPU obj_stock_pcp -- which caches a single
cached_objcg pointer -- thrashes on workloads where threads of the
same memcg run on different NUMA nodes. The kernel test robot reported
a 67.7% regression in stress-ng.switch.ops_per_sec from this pattern.

Commit d0211878ce06 ("memcg: cache obj_stock by memcg, not by objcg
pointer") landed as a temporary fix by treating sibling per-node
objcgs as equivalent for the cache lookup, intended to be reverted
once per-node kmem accounting is introduced. This series takes a more
general approach: cache multiple objcgs per CPU using the multi-slot
pattern memcg_stock_pcp already uses, so the per-node objcg variants
of one memcg can all coexist in the stock without ever forcing a
drain. The temporary fix can then be reverted.
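
The multi-slot idea can be sketched in userspace C as below. This is
only an illustration of the pattern, not the code in the patches: the
struct and function names (struct stock, stock_consume, stock_refill),
the drains counter and the missing locking and overflow handling are
all assumptions made for the example. Only the slot count of 5 and the
round-robin choice of the slot to drain come from this cover letter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_SLOTS 5	/* mirrors NR_OBJ_STOCK=5 from the cover letter */

struct objcg { int id; };	/* hypothetical stand-in for struct obj_cgroup */

struct stock {
	struct objcg *cached[NR_SLOTS];	/* NULL means the slot is empty */
	uint16_t nr_bytes[NR_SLOTS];	/* pre-charged bytes held per slot */
	uint8_t next_victim;		/* round-robin eviction cursor */
	unsigned int drains;		/* eviction count, illustration only */
};

/* Try to satisfy a charge from the cache; returns true on a hit. */
static bool stock_consume(struct stock *s, struct objcg *cg, uint16_t bytes)
{
	for (int i = 0; i < NR_SLOTS; i++) {
		if (s->cached[i] == cg && s->nr_bytes[i] >= bytes) {
			s->nr_bytes[i] -= bytes;
			return true;
		}
	}
	return false;
}

/* Return bytes to the cache, evicting one slot round-robin if it is full. */
static void stock_refill(struct stock *s, struct objcg *cg, uint16_t bytes)
{
	int slot = -1;

	for (int i = 0; i < NR_SLOTS; i++) {
		if (s->cached[i] == cg) {
			s->nr_bytes[i] += bytes;	/* overflow handling omitted */
			return;
		}
		if (!s->cached[i] && slot < 0)
			slot = i;	/* remember the first empty slot */
	}
	if (slot < 0) {
		/* No match and no empty slot: pick the next victim in turn. */
		slot = s->next_victim;
		s->next_victim = (s->next_victim + 1) % NR_SLOTS;
		s->drains++;	/* real code would flush the victim's bytes */
	}
	s->cached[slot] = cg;
	s->nr_bytes[slot] = bytes;
}
```

With a single slot, two objcgs used alternately on one CPU evict each
other on every refill. In the sketch both stay cached, and a drain
only happens once a sixth distinct objcg shows up.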

To avoid increasing the per-CPU cache footprint, the first three
patches shrink the existing single-slot obj_stock_pcp fields.
The final patch converts cached_objcg and nr_bytes into arrays of
NR_OBJ_STOCK (5) slots and reorders the struct so that the entire
consume/refill/account hot path fits within a single 64-byte cache
line on non-debug 64-bit builds (verified with pahole).
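
The size arithmetic can be checked with a small standalone sketch.
The layout below is a guess for illustration only: the field names
other than cached_objcg and nr_bytes, the field order and the
drain_idx member are assumptions, and the real struct in
mm/memcontrol.c also carries a lock and other members not shown here.
It only demonstrates that five pointers plus 16-bit counters can stay
within 64 bytes on a 64-bit ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_OBJ_STOCK	5
#define CACHE_LINE	64

/* Hypothetical layout; not the struct from the patches. */
struct obj_stock_sketch {
	void *cached_objcg[NR_OBJ_STOCK];	/* 5 * 8 = 40 bytes on 64-bit */
	uint16_t nr_bytes[NR_OBJ_STOCK];	/* 5 * 2 = 10 bytes */
	int16_t nr_slab_reclaimable_b;		/* 2 bytes, assumed name */
	int16_t nr_slab_unreclaimable_b;	/* 2 bytes, assumed name */
	int16_t node_id;			/* 2 bytes instead of a pointer */
	uint8_t drain_idx;			/* 1 byte, assumed round-robin cursor */
};

/* Bytes from the start of the struct to the end of the last hot field. */
static size_t hot_path_bytes(void)
{
	return offsetof(struct obj_stock_sketch, drain_idx) + sizeof(uint8_t);
}

#if UINTPTR_MAX == UINT64_MAX
/* Only meaningful where pointers are 8 bytes wide. */
static_assert(sizeof(struct obj_stock_sketch) <= CACHE_LINE,
	      "sketch must fit in one cache line");
#endif
```

Keeping node_id as a 16-bit index and the counters as 16-bit values
is what leaves room for the extra pointer slots in this sketch; with
an 8-byte pglist_data pointer and wider counters the same five slots
would spill into a second cache line.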

Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
Closes: https://lore.kernel.org/oe-lkp/202605121641.b6a60cb0-lkp@xxxxxxxxx
Fixes: 01b9da291c49 ("mm: memcontrol: convert objcg to be per-memcg per-node type")
Tested-by: kernel test robot <oliver.sang@xxxxxxxxx>

Shakeel Butt (4):
  memcg: store node_id instead of pglist_data pointer
  memcg: uint16_t for nr_bytes in obj_stock_pcp
  memcg: int16_t for cached slab stats
  memcg: multi objcg charge support

 mm/memcontrol.c | 214 +++++++++++++++++++++++++++++++++++-------------
 1 file changed, 157 insertions(+), 57 deletions(-)

--

Changes since v2:
http://lore.kernel.org/20260522011908.1669332-1-shakeel.butt@xxxxxxxxx

- Fix comments (Muchun & Qi)
- Simplify code (David Laight)
- Fix handling of archs with base page size larger than 256 KiB (Sashiko)

Changes since v1:
http://lore.kernel.org/20260520053123.2709959-1-shakeel.butt@xxxxxxxxx

- Collected review tags (Harry & Muchun)
- Fix comparison operators (Harry)
- Use round robin for drain

2.53.0-Meta