Re: [Regression] mm:slab/sheaves: severe performance regression in cross-CPU slab allocation

From: Ming Lei

Date: Wed Feb 25 2026 - 07:30:51 EST


On Wed, Feb 25, 2026 at 12:29:26PM +0100, Vlastimil Babka (SUSE) wrote:
> On 2/25/26 10:31, Ming Lei wrote:
> > Hi Vlastimil,
> >
> > On Wed, Feb 25, 2026 at 09:45:03AM +0100, Vlastimil Babka (SUSE) wrote:
> >> On 2/24/26 21:27, Vlastimil Babka wrote:
> >> >
> >> > It made sense to me not to refill sheaves when we can't reclaim, but I
> >> > didn't anticipate this interaction with mempools. We could change them
> >> > but there might be others using a similar pattern. Maybe it would be for
> >> > the best to just drop that heuristic from __pcs_replace_empty_main()
> >> > (but carefully as some deadlock avoidance depends on it, we might need
> >> > to e.g. replace it with gfpflags_allow_spinning()). I'll send a patch
> >> > tomorrow to test this theory, unless someone beats me to it (feel free to).
> >> Could you try this then, please? Thanks!
> >
> > Thanks for working on this issue!
> >
> > Unfortunately the patch makes no difference to IOPS in the perf test. The
> > perf profile collected on Linus' tree (basically 7.0-rc1 with your patch) follows:
>
> Hm, that's weird; the slowpath is still prominent in your profile.
>
> I followed your reproducer instructions, although only with a small
> virtme-ng based setup. What's the output of "numactl -H" on yours, btw?

available: 8 nodes (0-7)
node 0 cpus: 0 1 2 3 32 33 34 35
node 0 size: 0 MB
node 0 free: 0 MB
node 1 cpus: 4 5 6 7 36 37 38 39
node 1 size: 31906 MB
node 1 free: 30572 MB
node 2 cpus: 8 9 10 11 40 41 42 43
node 2 size: 0 MB
node 2 free: 0 MB
node 3 cpus: 12 13 14 15 44 45 46 47
node 3 size: 0 MB
node 3 free: 0 MB
node 4 cpus: 16 17 18 19 48 49 50 51
node 4 size: 0 MB
node 4 free: 0 MB
node 5 cpus: 20 21 22 23 52 53 54 55
node 5 size: 32135 MB
node 5 free: 31086 MB
node 6 cpus: 24 25 26 27 56 57 58 59
node 6 size: 0 MB
node 6 free: 0 MB
node 7 cpus: 28 29 30 31 60 61 62 63
node 7 size: 0 MB
node 7 free: 0 MB
node distances:
node   0   1   2   3   4   5   6   7
  0:  10  12  12  12  32  32  32  32
  1:  12  10  12  12  32  32  32  32
  2:  12  12  10  12  32  32  32  32
  3:  12  12  12  10  32  32  32  32
  4:  32  32  32  32  10  12  12  12
  5:  32  32  32  32  12  10  12  12
  6:  32  32  32  32  12  12  10  12
  7:  32  32  32  32  12  12  12  10

>
> Anyway what I saw is my patch raised the IOPS substantially, and with
> CONFIG_SLUB_STATS=y enabled I could see that
> /sys/kernel/slab/bio-248/alloc_slowpath had substantial values before the
> patch and zero afterwards.
>
> Maybe if you could also enable CONFIG_SLUB_STATS=y and see in which cache(s)
> there's significant alloc_slowpath even after the patch, it could help.
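
For reference, the per-cache check suggested above could be scripted roughly as below. This is only a sketch, assuming a kernel built with CONFIG_SLUB_STATS=y, that each /sys/kernel/slab/<cache>/alloc_slowpath file starts with the summed count, and that the files are readable (typically as root). The helper name rank_alloc_slowpath() is made up for illustration.

```python
import os


def rank_alloc_slowpath(root="/sys/kernel/slab"):
    """Return [(total, cache_name), ...] sorted by alloc_slowpath, largest first."""
    seen = set()
    totals = []
    for name in sorted(os.listdir(root)):
        # Merged caches show up as symlinks to one shared directory;
        # resolve them so each underlying cache is counted only once.
        cache_dir = os.path.realpath(os.path.join(root, name))
        if cache_dir in seen:
            continue
        seen.add(cache_dir)
        try:
            with open(os.path.join(cache_dir, "alloc_slowpath")) as f:
                fields = f.read().split()
        except OSError:
            # Stat file missing (no CONFIG_SLUB_STATS) or not readable.
            continue
        if fields:
            # First field is the total; per-CPU "C<n>=<count>" fields follow.
            totals.append((int(fields[0]), os.path.basename(cache_dir)))
    totals.sort(reverse=True)
    return totals
```

Printing the first few entries of rank_alloc_slowpath() before and after applying the patch should show which caches still take the slowpath.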

Patched:

/sys/kernel/slab/bio-264
./alloc_slowpath:83555260 C0=33 C1=6717992 C2=9 C3=6611030 C8=128 C9=6802316 C11=6934363 C13=6721479 C14=66 C15=6694472 C16=96 C17=7286868 C18=128 C19=7369091 C24=128 C25=7288673 C26=51 C27=6800502 C28=129 C29=7095073 C31=7232628 C43=4 C56=1
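
To make the per-CPU breakdown above easier to read, the line can be split and summed per NUMA node with something like the sketch below. The function names are hypothetical, and node_of() hard-codes the CPU layout from the numactl -H output earlier in this mail (node N holds CPUs 4N..4N+3 plus siblings 32+4N..32+4N+3), so it is an assumption that only holds for this machine.

```python
from collections import defaultdict


def parse_slab_stat(line):
    """Split 'TOTAL C<cpu>=<count> ...' into (total, {cpu: count})."""
    fields = line.split()
    # Tolerate a './alloc_slowpath:' prefix as in the pasted output.
    total = int(fields[0].rsplit(":", 1)[-1])
    per_cpu = {}
    for field in fields[1:]:
        cpu, count = field[1:].split("=")  # drop the leading 'C'
        per_cpu[int(cpu)] = int(count)
    return total, per_cpu


def node_of(cpu):
    # Assumption: 64 CPUs, 8 nodes, laid out as in the numactl -H output above.
    return (cpu % 32) // 4


def sum_by_node(per_cpu):
    """Aggregate per-CPU counts into per-node counts."""
    per_node = defaultdict(int)
    for cpu, count in per_cpu.items():
        per_node[node_of(cpu)] += count
    return dict(per_node)
```

Feeding the alloc_slowpath line to parse_slab_stat() and the resulting dict to sum_by_node() gives one counter per node, which can be compared against the nodes that actually have memory.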

Also config.tar.gz is attached.

Thanks,
Ming

Attachment: config.tar.gz
Description: application/gzip