Re: [PATCH slab/for-next-fixes] mm/slab: allow sheaf refill if blocking is not allowed

From: Vlastimil Babka

Date: Wed Mar 04 2026 - 05:01:31 EST


On 3/4/26 4:05 AM, Harry Yoo wrote:
> On Mon, Mar 02, 2026 at 10:55:37AM +0100, Vlastimil Babka (SUSE) wrote:
>> Ming Lei reported [1] a regression in the ublk null target benchmark due
>> to sheaves. The profile shows that the alloc_from_pcs() fastpath fails
>> and allocations fall back to ___slab_alloc(). It also shows the
>> allocations happen through mempool_alloc().
>>
>> The strategy of mempool_alloc() is to call the underlying allocator
>> (here slab) without __GFP_DIRECT_RECLAIM first. This does not play well
>> with __pcs_replace_empty_main() checking for gfpflags_allow_blocking()
>> to decide if it should refill an empty sheaf or fall back to the
>> slowpath, so we end up falling back.
>>
>> We could change the mempool strategy but there might be other paths
>> doing the same thing. So instead allow sheaf refill when blocking is not
>> allowed, changing the condition to gfpflags_allow_spinning(). The
>> original condition was unnecessarily restrictive.
>>
>> Note this doesn't fully resolve the regression [1], as another component
>> of it is memoryless nodes, which will be addressed separately.
>>
>> Reported-by: Ming Lei <ming.lei@xxxxxxxxxx>
>> Fixes: e47c897a2949 ("slab: add sheaves to most caches")
>> Link: https://lore.kernel.org/all/aZ0SbIqaIkwoW2mB@fedora/
>> Signed-off-by: Vlastimil Babka (SUSE) <vbabka@xxxxxxxxxx>
>> ---
>> mm/slub.c | 21 +++++++++------------
>> 1 file changed, 9 insertions(+), 12 deletions(-)
>>
>> diff --git a/mm/slub.c b/mm/slub.c
>> index b1e9f16ba435..17b200695e9b 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -4632,11 +4631,8 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
>>  	if (!full)
>>  		return NULL;
>>
>> -	/*
>> -	 * we can reach here only when gfpflags_allow_blocking
>> -	 * so this must not be an irq
>> -	 */
>> -	local_lock(&s->cpu_sheaves->lock);
>> +	if (!local_trylock(&s->cpu_sheaves->lock))
>> +		goto barn_put;
>
> My AI buddy says (don't worry, I filtered it):
> | When local_trylock() fails above, the function jumps to barn_put and returns
> | pcs without holding the lock. This appears to violate the function's contract
> | documented in the comment at the beginning of __pcs_replace_empty_main():
> |
> | "If not successful, returns NULL and the local lock unlocked."
> |
> | The caller in alloc_from_pcs() checks for NULL to detect failure:
> |
> | 	if (unlikely(pcs->main->size == 0)) {
> | 		pcs = __pcs_replace_empty_main(s, pcs, gfp);
> | 		if (unlikely(!pcs))
> | 			return NULL;
> | 	}
> |
> | If the trylock fails and pcs (non-NULL) is returned, the caller proceeds
> | without realizing the lock was never re-acquired. This leads to accessing
> | pcs->main without the lock and later trying to unlock a lock that isn't held.
>
> And the analysis sounds correct to me.
>
> perhaps it should be:
>
> 	if (!local_trylock(&s->cpu_sheaves->lock)) {
> 		pcs = NULL;
> 		goto barn_put;
> 	}

Thanks a lot Harry. In fact I realized this mistake after initially
sending the patch to Ming in a reply, and fixed it locally (same as you
suggest).
Or so I thought, because the fix apparently got lost.
So I'll do that now in slab/for-next-fixes.

Or actually I think a more robust way is to set pcs = NULL after the
unlock, unconditionally, so I'll do that.
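
Roughly like this (a sketch only; I'm paraphrasing the surrounding
function from the hunk context above, so the exact placement may differ):

	local_unlock(&s->cpu_sheaves->lock);
	/*
	 * pcs is stale once the lock is dropped; clear it so any path
	 * that fails to retake the lock reports failure to the caller
	 */
	pcs = NULL;

	/* ... refill or allocate a full sheaf ... */

	if (!local_trylock(&s->cpu_sheaves->lock))
		goto barn_put;

	pcs = this_cpu_ptr(s->cpu_sheaves);

	/* ... if main is still empty, install full and return pcs ... */

barn_put:
	barn_put_full_sheaf(barn, full);
	stat(s, BARN_PUT);

	return pcs;

That way the trylock failure path can't forget to clear pcs, and the
contract (NULL return means the lock is not held) holds on every path.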

>>  	pcs = this_cpu_ptr(s->cpu_sheaves);
>>
>>  	/*
>> @@ -4667,6 +4663,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
>>  		return pcs;
>>  	}
>>
>> +barn_put:
>>  	barn_put_full_sheaf(barn, full);
>>  	stat(s, BARN_PUT);
>