Re: [linus:master] [mm] 01b9da291c: stress-ng.switch.ops_per_sec 67.7% regression

From: Shakeel Butt

Date: Wed May 13 2026 - 10:11:40 EST


On Wed, May 13, 2026 at 10:10:34AM +0800, Qi Zheng wrote:
>
>
> On 5/13/26 12:03 AM, Shakeel Butt wrote:
> > On Tue, May 12, 2026 at 08:56:52PM +0800, kernel test robot wrote:
> > >
> > >
> > > Hello,
> > >
> > > kernel test robot noticed a 67.7% regression of stress-ng.switch.ops_per_sec on:
> > >
> > >
> > > commit: 01b9da291c4969354807b52956f4aae1f41b4924 ("mm: memcontrol: convert objcg to be per-memcg per-node type")
> > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> >
> > This is most probably due to shuffling of struct mem_cgroup and struct
> > mem_cgroup_per_node members.
>
> Another possibility is that after objcg was split into per-node, the
> slab accounting fast path is still designed assuming only one current
> objcg per CPU:
>
> struct obj_stock_pcp {
> struct obj_cgroup *cached_objcg;
> };
>
> So it may cause the following thrashing:
>
> CPU stock cached = memcg/node0 objcg
> free object tagged = memcg/node1 objcg
> => __refill_obj_stock --> objcg mismatch
> => drain_obj_stock()
> => cache switches to node1 objcg
>
> next local allocation tagged = node0 objcg
> => mismatch again
> => drain_obj_stock()

Actually I think this is the issue. We have ping-pong threads running on
different nodes: though they are in the same cgroup, their current->objcg is
for the local node, so this ping-pong thrashes the per-cpu objcg stock.

The easier fix would be to compare objcg->memcg instead of just objcg during
draining and caching. In addition, we can add support for caching multiple
objcgs in the per-cpu stock.
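
To illustrate the difference between the two comparisons, here is a rough
userspace sketch, not kernel code. The struct layouts are simplified stand-ins,
and stock_matches(), refill_stock(), count_drains() and the nr_drains counter
are hypothetical names made up for this model. It only counts how often the
stock would be drained when a node0 allocation and a node1 free alternate on
one CPU within the same memcg. It does not address whether bytes charged
through different per-node objcgs can be pooled safely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for illustration, not the kernel definitions. */
struct mem_cgroup { int id; };
struct obj_cgroup { struct mem_cgroup *memcg; int nid; };

struct obj_stock_pcp {
	struct obj_cgroup *cached_objcg;
	unsigned int nr_bytes;
	unsigned long nr_drains;	/* instrumentation for this sketch only */
};

/* by_memcg == false: compare objcg pointers; true: compare owning memcg. */
static bool stock_matches(const struct obj_stock_pcp *stock,
			  const struct obj_cgroup *objcg, bool by_memcg)
{
	if (!stock->cached_objcg)
		return false;
	if (by_memcg)
		return stock->cached_objcg->memcg == objcg->memcg;
	return stock->cached_objcg == objcg;
}

/* Model of the refill path: a mismatch drains and re-caches the stock. */
static void refill_stock(struct obj_stock_pcp *stock, struct obj_cgroup *objcg,
			 unsigned int nr_bytes, bool by_memcg)
{
	assert(objcg != NULL);

	if (!stock_matches(stock, objcg, by_memcg)) {
		if (stock->cached_objcg)
			stock->nr_drains++;	/* stands in for drain_obj_stock() */
		stock->nr_bytes = 0;
		stock->cached_objcg = objcg;
	}
	stock->nr_bytes += nr_bytes;
}

/* Alternate node0 and node1 objcgs of one memcg and count the drains. */
static unsigned long count_drains(bool by_memcg, int rounds)
{
	struct mem_cgroup memcg = { .id = 1 };
	struct obj_cgroup node0 = { .memcg = &memcg, .nid = 0 };
	struct obj_cgroup node1 = { .memcg = &memcg, .nid = 1 };
	struct obj_stock_pcp stock = { 0 };

	for (int i = 0; i < rounds; i++) {
		refill_stock(&stock, &node0, 64, by_memcg);	/* local allocation */
		refill_stock(&stock, &node1, 64, by_memcg);	/* remotely tagged free */
	}
	return stock.nr_drains;
}
```

In this model, comparing objcg pointers drains on every call after the first,
while comparing the owning memcg never drains for this access pattern.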