[PATCH RFC 4/4] mm/slab: serialize defer_free_barrier()

From: Harry Yoo (Oracle)

Date: Wed Jun 24 2026 - 09:12:28 EST


irq_work_sync() uses rcuwait instead of busy waiting in two cases:

1. The kernel is using PREEMPT_RT and the irq work does not run in a
   hardirq context.

2. The architecture cannot send inter-processor interrupts to make
   busy waiting reasonably short.

However, rcuwait.h says:
> The caller is responsible for locking around rcuwait_wait_event(),
> and [prepare_to/finish]_rcuwait() such that writes to @task are
> properly serialized.

Since defer_free_barrier() calls irq_work_sync() without holding any
lock, writes to @task are not serialized, which can potentially cause
a hang.

Fix this by calling defer_free_barrier() with cpus_read_lock() and
slab_mutex held, and add lockdep assertions to document and enforce
that requirement.

Now that defer_free_barrier() is called with cpus_read_lock() held,
iterate over online CPUs instead of possible CPUs.

Reported-by: Sashiko <sashiko+bot@xxxxxxxxxx>
Closes: https://sashiko.dev/#/patchset/20260615-kfree_rcu_nolock-v3-0-70a54f3775bb%40kernel.org?part=5
Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Harry Yoo (Oracle) <harry@xxxxxxxxxx>
---
 mm/slab_common.c | 5 ++---
 mm/slub.c        | 6 +++++-
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 388eb5980859..27f77273fabe 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -550,11 +550,10 @@ void kmem_cache_destroy(struct kmem_cache *s)
 		rcu_barrier();
 	}
 
-	/* Wait for deferred work from kmalloc/kfree_nolock() */
-	defer_free_barrier();
-
 	cpus_read_lock();
 	mutex_lock(&slab_mutex);
+	/* Wait for deferred work from kmalloc/kfree_nolock() */
+	defer_free_barrier();
 
 	s->refcount--;
 	if (s->refcount) {
diff --git a/mm/slub.c b/mm/slub.c
index 4a3618e3967e..52c8d3f33782 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -6411,7 +6411,11 @@ void defer_free_barrier(void)
 {
 	int cpu;
 
-	for_each_possible_cpu(cpu)
+	/* irq_work_sync() may use rcuwait that requires serialization */
+	lockdep_assert_held(&slab_mutex);
+	lockdep_assert_cpus_held();
+
+	for_each_online_cpu(cpu)
 		irq_work_sync(&per_cpu_ptr(&defer_free_objects, cpu)->work);
 }


--
2.53.0