Re: [PATCH v2 0/4] memcg: shrink obj_stock_pcp and cache multiple objcgs
From: Shakeel Butt
Date: Mon May 25 2026 - 14:54:11 EST

On Fri, May 22, 2026 at 07:34:40PM -0700, Andrew Morton wrote:
> On Thu, 21 May 2026 18:19:04 -0700 Shakeel Butt <shakeel.butt@xxxxxxxxx> wrote:
>
> > Commit 01b9da291c49 ("mm: memcontrol: convert objcg to be per-memcg
> > per-node type") split a memcg's single obj_cgroup into one per NUMA
> > node so that reparenting LRU folios can take per-node lru locks. As a
> > side effect, the per-CPU obj_stock_pcp -- which caches a single
> > cached_objcg pointer -- thrashes on workloads where threads of the
> > same memcg run on different NUMA nodes. The kernel test robot reported
> > a 67.7% regression on stress-ng.switch.ops_per_sec from this pattern.
> >
> > Commit d0211878ce06 ("memcg: cache obj_stock by memcg, not by objcg
> > pointer") landed as a temporary fix by treating sibling per-node
> > objcgs as equivalent for the cache lookup, intended to be reverted
> > once per-node kmem accounting is introduced. This series takes a more
> > general approach: cache multiple objcgs per CPU using the multi-slot
> > pattern memcg_stock_pcp already uses, so the per-node objcg variants
> > of one memcg can all coexist in the stock without ever forcing a
> > drain. The temporary fix can then be reverted.
> >
> > To avoid increasing the per-CPU cache footprint, the first three
> > patches shrink the existing single-slot obj_stock_pcp fields.
> > The final patch converts cached_objcg and nr_bytes into
> > NR_OBJ_STOCK=5 slot arrays and reorders the struct so the entire
> > consume/refill/account hot path fits within a single 64-byte cache
> > line on non-debug 64-bit builds (verified with pahole).
>
> Thanks, I added this to mm.git's mm-new branch, along with a couple of
> possible todo notes from the review.
>
> Sashiko asked a thing:
> https://sashiko.dev/#/patchset/20260522011908.1669332-1-shakeel.butt@xxxxxxxxx
>
> Did you already see this? The footers there indicate that an email was
> sent out but I don't know if it works?

Yes, I saw that comment. It is quite specific to archs with 256KiB base
page sizes. Anyway, I have a simple fix for it, and others suggested a few
minor changes, so I will send v3 with the requested changes.
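
For readers new to the thread, the multi-slot idea from the quoted cover
letter can be sketched in userspace C. This is only an illustration under
stated assumptions, not code from the series: struct objcg, struct obj_stock,
stock_consume() and stock_refill() are hypothetical names, field widths and
per-CPU locking are not modeled, and evicting slot 0 when all slots are
occupied is a simplification chosen for the sketch. Only the slot count
(NR_OBJ_STOCK=5) comes from the cover letter.

```c
#include <assert.h>
#include <stddef.h>

#define NR_OBJ_STOCK 5	/* slot count quoted in the cover letter */

/* Hypothetical stand-in for struct obj_cgroup. */
struct objcg {
	int id;
};

/* Hypothetical model of one CPU's stock: several objcgs cached at once. */
struct obj_stock {
	struct objcg *cached[NR_OBJ_STOCK];
	unsigned int nr_bytes[NR_OBJ_STOCK];
};

/* Try to take 'bytes' from a slot already caching 'cg'. Returns 1 on a hit. */
static int stock_consume(struct obj_stock *s, struct objcg *cg,
			 unsigned int bytes)
{
	for (int i = 0; i < NR_OBJ_STOCK; i++) {
		if (s->cached[i] == cg && s->nr_bytes[i] >= bytes) {
			s->nr_bytes[i] -= bytes;
			return 1;
		}
	}
	return 0;
}

/*
 * Put 'bytes' back for 'cg'. Prefer a matching slot, then an empty slot.
 * Only when every slot holds a different objcg is one evicted (slot 0 here,
 * an assumption of this sketch). Returns the number of bytes drained.
 */
static unsigned int stock_refill(struct obj_stock *s, struct objcg *cg,
				 unsigned int bytes)
{
	int empty = -1;
	unsigned int drained;

	assert(cg != NULL);

	for (int i = 0; i < NR_OBJ_STOCK; i++) {
		if (s->cached[i] == cg) {
			s->nr_bytes[i] += bytes;
			return 0;
		}
		if (!s->cached[i] && empty < 0)
			empty = i;
	}

	if (empty >= 0) {
		s->cached[empty] = cg;
		s->nr_bytes[empty] = bytes;
		return 0;
	}

	drained = s->nr_bytes[0];
	s->cached[0] = cg;
	s->nr_bytes[0] = bytes;
	return drained;
}
```

With a single slot, two per-node objcgs of the same memcg alternating on one
CPU would force a drain on every switch. In this model they simply occupy
two of the five slots, and a drain happens only once a sixth distinct objcg
shows up.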