Re: [Regression] mm:slab/sheaves: severe performance regression in cross-CPU slab allocation

From: Ming Lei

Date: Fri Feb 27 2026 - 04:25:35 EST


On Thu, Feb 26, 2026 at 07:02:11PM +0100, Vlastimil Babka (SUSE) wrote:
> On 2/25/26 10:31, Ming Lei wrote:
> > Hi Vlastimil,
> >
> > On Wed, Feb 25, 2026 at 09:45:03AM +0100, Vlastimil Babka (SUSE) wrote:
> >> On 2/24/26 21:27, Vlastimil Babka wrote:
> >> >
> >> > It made sense to me not to refill sheaves when we can't reclaim, but I
> >> > didn't anticipate this interaction with mempools. We could change them
> >> > but there might be others using a similar pattern. Maybe it would be for
> >> > the best to just drop that heuristic from __pcs_replace_empty_main()
> >> > (but carefully as some deadlock avoidance depends on it, we might need
> >> > to e.g. replace it with gfpflags_allow_spinning()). I'll send a patch
> >> > tomorrow to test this theory, unless someone beats me to it (feel free to).
> >> Could you try this then, please? Thanks!
> >
> > Thanks for working on this issue!
> >
> > Unfortunately the patch doesn't make a difference on IOPS in the perf test,
> > follows the collected perf profile on linus tree(basically 7.0-rc1 with your patch):
>
> what about this patch in addition to the previous one? Thanks.

With both patches applied, IOPS increases from 13M to 22M, but that is still
well below the 36M obtained on v6.19-rc5 (the slab sheaves PR follows v6.19-rc5).

Also, alloc_slowpath can no longer be observed.

The perf profile with the two patches follows:


-    8.30%     0.19%  io_uring  [kernel.kallsyms]  [k] mempool_alloc_noprof
   - 8.11% mempool_alloc_noprof
      - 7.64% kmem_cache_alloc_noprof
         - 6.15% __pcs_replace_empty_main
            - 5.96% refill_sheaf
               + 5.95% refill_objects
+    8.06%     0.44%  io_uring  [kernel.kallsyms]  [k] kmem_cache_alloc_noprof
+    7.44%     0.00%  kublk     [ublk_drv]         [k] 0xffffffffc140c71b
+    6.63%     0.03%  kublk     [kernel.kallsyms]  [k] __io_run_local_work
+    6.19%     0.05%  io_uring  [kernel.kallsyms]  [k] __pcs_replace_empty_main
-    5.97%     0.01%  io_uring  [kernel.kallsyms]  [k] refill_sheaf
   - 5.96% refill_sheaf
      - 5.95% refill_objects
         - 4.87% __refill_objects_any
            - 4.76% __refill_objects_node
                 0.72% __slab_free
         - 1.00% allocate_slab
            - 0.80% __alloc_frozen_pages_noprof
               - 0.79% get_page_from_freelist
                  + 0.72% post_alloc_hook
+    5.96%     0.02%  io_uring  [kernel.kallsyms]  [k] refill_objects


thanks,
Ming