Re: [PATCH] slab: prevent infinite loop in kmalloc_nolock() with debugging

From: Dev Jain

Date: Tue Nov 04 2025 - 05:36:07 EST



On 04/11/25 3:54 pm, Vlastimil Babka wrote:
> On 11/4/25 6:26 AM, Dev Jain wrote:
>> On 03/11/25 5:54 pm, Vlastimil Babka wrote:
>>> In review of a followup work, Harry noticed a potential infinite loop.
>>> Upon closer inspection, it already exists for kmalloc_nolock() on a
>>> cache with debugging enabled, since commit af92793e52c3 ("slab:
>>> Introduce kmalloc_nolock() and kfree_nolock().")
>>>
>>> When alloc_single_from_new_slab() fails to trylock node list_lock, we
>>> keep retrying to get partial slab or allocate a new slab. If we indeed
>>> interrupted somebody holding the list_lock, the trylock will fail
>> Hi Vlastimil,
>>
>> I see that we always take n->list_lock spinlock by disabling irqs. So
>> how can we interrupt someone holding the list_lock?
> From an NMI or e.g. a kprobe->bpf hook, which are the use cases for
> kmalloc_nolock(). The word "interrupt" thus doesn't mean IRQ, but I'm
> not sure which word would be better. "Preempt" would be perhaps even
> more potentially misleading.

>> If we are already in a path holding list_lock, and trigger a slab
>> allocation and recursively end up in the same path again, we can get
>> the situation you mention, is that possible?
> There shouldn't be such recursion in the code itself, in the absence of
> NMI/kprobe/etc.

Thanks for explaining.

>>> deterministically and we end up allocating and defer-freeing slabs
>>> indefinitely with no progress.
>>>
>>> To fix it, fail the allocation if spinning is not allowed. This is
>>> acceptable in the restricted context of kmalloc_nolock(), especially
>>> with debugging enabled.
>>>
>>> Reported-by: Harry Yoo <harry.yoo@xxxxxxxxxx>
>>> Closes: https://lore.kernel.org/all/aQLqZjjq1SPD3Fml@hyeyoo/
>>> Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
>>> Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx>
>>> ---
>>> as we discussed in the linked thread, 6.18 hotfix to be included in
>>> slab/for-next-fixes
>>> ---
>>>  mm/slub.c | 6 +++++-
>>>  1 file changed, 5 insertions(+), 1 deletion(-)

>>> diff --git a/mm/slub.c b/mm/slub.c
>>> index d4367f25b20d..f1a5373eee7b 100644
>>> --- a/mm/slub.c
>>> +++ b/mm/slub.c
>>> @@ -4666,8 +4666,12 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
>>>  	if (kmem_cache_debug(s)) {
>>>  		freelist = alloc_single_from_new_slab(s, slab, orig_size, gfpflags);
>>>  
>>> -		if (unlikely(!freelist))
>>> +		if (unlikely(!freelist)) {
>>> +			/* This could cause an endless loop. Fail instead. */
>>> +			if (!allow_spin)
>>> +				return NULL;
>>>  			goto new_objects;
>>> +		}
>>>  
>>>  		if (s->flags & SLAB_STORE_USER)
>>>  			set_track(s, freelist, TRACK_ALLOC, addr,

>>> ---
>>> base-commit: 6146a0f1dfae5d37442a9ddcba012add260bceb0
>>> change-id: 20251103-fix-nolock-loop-854e0101672f

Best regards,