Re: [PATCH for-next v3 7/9] mm/slab: introduce kfree_rcu_nolock()
From: hu.shengming
Date: Thu Jun 25 2026 - 21:54:20 EST

Harry wrote:
> On 6/24/26 6:22 PM, hu.shengming@xxxxxxxxxx wrote:
> > Harry wrote:
> >> Currently, k[v]free_rcu() cannot be called in unknown context since
> >> it could lead to a deadlock when called in the middle of k[v]free_rcu().
> >>
> >> Make users' lives easier by introducing kfree_rcu_nolock() variant,
> >> now that kfree_rcu_sheaf() is available on PREEMPT_RT and
> >> __kfree_rcu_sheaf() handles unknown context.
> >>
> >> Unlike k[v]free_rcu(), kfree_rcu_nolock() does not fall back to
> >> the kvfree_rcu batching when the sheaves path fails, and falls back to
> >> defer_kfree_rcu() instead. In most cases, the sheaves path is expected
> >> to succeed and it's unnecessary to add complexity to the existing
> >> kvfree_rcu batching.
> >>
> >> Since defer_kfree_rcu() can be called on caches without sheaves, move
> >> deferred_work_barrier() and rcu_barrier() outside the branch in
> >> kvfree_rcu_barrier_on_cache().
> >>
> >> Signed-off-by: Harry Yoo (Oracle) <harry@xxxxxxxxxx>
> >
> > Hi Harry,
> >
> > Thanks for the series. These patches fill a clear functional gap in the
> > existing free APIs by adding an RCU-deferred free interface for contexts
> > where kfree_rcu() cannot safely be used.
>
> Thanks for looking into this, Shengming.

Happy to review. :)

> >> ---
> >> include/linux/rcupdate.h | 12 ++++++++++++
> >> mm/slab.h | 1 +
> >> mm/slab_common.c | 22 ++++++++++++++++++++--
> >> mm/slub.c | 23 ++++++++++++++++++++++-
> >> 4 files changed, 55 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/mm/slab_common.c b/mm/slab_common.c
> >> index 807924a94fb0..5a39e6225160 100644
> >> --- a/mm/slab_common.c
> >> +++ b/mm/slab_common.c
> >> @@ -1263,6 +1263,23 @@ EXPORT_TRACEPOINT_SYMBOL(kmem_cache_alloc);
> >> EXPORT_TRACEPOINT_SYMBOL(kfree);
> >> EXPORT_TRACEPOINT_SYMBOL(kmem_cache_free);
> >>
> >> +void kfree_call_rcu_nolock(struct rcu_head *head, void *ptr)
> >> +{
> >> + struct slab *slab;
> >> + struct kmem_cache *s;
> >> +
> >> + VM_WARN_ON_ONCE(is_vmalloc_addr(ptr) || !virt_to_slab(ptr));
> >> +
> >> + slab = virt_to_slab(ptr);
> >> + s = slab->slab_cache;
> >> +
> >> + if (__kfree_rcu_sheaf(s, ptr, /* allow_spin = */ false))
> >> + return;
> >> +
> >
> > One consistency issue to address here: kfree_rcu_sheaf() only calls
> > __kfree_rcu_sheaf() for objects belonging to the local NUMA node. This
> > avoids filling a CPU's per-CPU sheaves with objects from remote slabs.
> >
> > kfree_call_rcu_nolock() currently skips that check and may therefore
> > place remote-node objects into the local CPU's RCU sheaf.
>
> That was intentional, but actually, this is a good point. Thanks.
>
> > Could you add the same local-node check used by kfree_rcu_sheaf()
> > before calling __kfree_rcu_sheaf(), and route remote-node objects
> > directly to the defer_kfree_rcu() fallback path instead?
>
> Falling back to defer_kfree_rcu() in v3 didn't make much sense
> as the object is inserted to a global list which would cause more
> troubles than NUMA miss.
>

Thanks for the clarification.
That makes sense. With the current fallback implementation, routing
remote-node objects directly to defer_kfree_rcu() would put them on the
global deferred list, which could be worse than keeping them in the local
CPU's sheaf despite the NUMA miss.

> But once we make the fallback path percpu, your suggestion would make
> more sense.
>

Fair point. That said, converting the fallback path to a per-CPU one
seems to diverge from what patch 8 of the v3 series implements.
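
To make the two routing policies we are comparing concrete, here is a
small user-space toy model. It is not kernel code and it is not taken
from the series: toy_obj, route_v3(), route_local_only() and the ROUTE_*
values are all names I made up for illustration, and sheaf_ok merely
stands in for whether the sheaf path would succeed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration only; none are kernel APIs. */
enum route { ROUTE_SHEAF, ROUTE_FALLBACK };

struct toy_obj {
	int nid;	/* NUMA node of the slab backing the object */
};

/*
 * Policy as discussed for v3: always try the local CPU's RCU sheaf,
 * even for remote-node objects, and use the fallback only when the
 * sheaf path fails.
 */
static enum route route_v3(const struct toy_obj *obj, int local_nid,
			   bool sheaf_ok)
{
	(void)obj;
	(void)local_nid;
	return sheaf_ok ? ROUTE_SHEAF : ROUTE_FALLBACK;
}

/*
 * Suggested policy: only local-node objects may enter the sheaf;
 * remote-node objects go straight to the fallback path.
 */
static enum route route_local_only(const struct toy_obj *obj, int local_nid,
				   bool sheaf_ok)
{
	if (obj->nid != local_nid)
		return ROUTE_FALLBACK;
	return sheaf_ok ? ROUTE_SHEAF : ROUTE_FALLBACK;
}

/* Tiny self-check showing where the two policies differ. */
static void toy_selftest(void)
{
	struct toy_obj remote = { .nid = 1 };

	assert(route_v3(&remote, 0, true) == ROUTE_SHEAF);
	assert(route_local_only(&remote, 0, true) == ROUTE_FALLBACK);
}
```

The two policies differ only for remote-node objects when the sheaf path
would have succeeded. Whether the second policy is preferable depends
on how expensive the fallback is, which is exactly the point about the
global list versus a per-CPU one.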
--
With Best Regards,
Shengming