Re: [BUG] Memory ordering between kmalloc() and kfree()? it's confusing!

From: Alan Stern

Date: Thu Feb 26 2026 - 13:30:47 EST

On Fri, Feb 27, 2026 at 02:11:49AM +0900, Harry Yoo wrote:
> On Thu, Feb 26, 2026 at 11:42:02AM -0500, Alan Stern wrote:
> > On Fri, Feb 27, 2026 at 01:17:52AM +0900, Harry Yoo wrote:
> > > On Thu, Feb 26, 2026 at 10:45:55AM -0500, Alan Stern wrote:
> > > > On Thu, Feb 26, 2026 at 03:35:08PM +0900, Harry Yoo wrote:
> > > > > Because the slab allocator itself doesn't guarantee that such
> > > > > barriers are invoked within the allocator, it relies on users to
> > > > > do this when needed.
> > > >
> > > > It doesn't? Then how does the slab allocator guarantee that two
> > > > different CPUs won't try to perform allocations or deallocations from
> > > > the same slab at the same time, messing everything up?
> > >
> > > Ah, alloc/free slowpaths do use cmpxchg128 or spinlock and
> > > don't mess things up.
> > >
> > > But fastpath allocs/frees are served from a percpu array that is protected
> > > by a local_lock. local_lock has a compiler barrier in it, but that's
> > > not enough.
> >
> > If those things rely on a percpu array, how can one CPU possibly
> > manipulate a resource (slab or something else) that was changed by a
> > different CPU?
>
> AFAICT that shouldn't happen within the slab allocator.
>
> > The whole point of percpu data structures is that each
> > CPU gets its own copy.
>
> Exactly.
>
> But I'm not talking about what happens within the allocator,
> but rather, about what slab expects to happen outside the allocator.

I understand.

> Something like this:
>
> CPU X                          CPU Y
> ptr = kmalloc();
> WRITE_ONCE(gp, ptr);
>                                if (p = READ_ONCE(gp))
>                                        kfree(p);
>
> Yes, it's a crazy thing to do. CPU Y isn't guaranteed to see an
> up-to-date version of the object's content or metadata.
>
> Instead, the code should do:
>
> CPU X                          CPU Y
> ptr = kmalloc();
> smp_store_release(&gp, ptr);
>
>                                if (p = smp_load_acquire(&gp))
>                                        kfree(p);
>
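For reference, the release/acquire pairing in that second version can be
sketched in userspace C11. Here atomic_store_explicit() and
atomic_load_explicit() stand in for smp_store_release() and
smp_load_acquire(), and malloc()/free() stand in for kmalloc()/kfree().
It is only an analogy under those assumptions, not kernel code; struct
obj, cpu_x(), cpu_y() and run_handoff() are made-up names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a slab object; not a kernel type. */
struct obj {
	int payload;
};

/* Shared pointer, playing the role of gp in the example above. */
static _Atomic(struct obj *) gp;

/* "CPU X": allocate, initialize, then publish with release semantics. */
static void *cpu_x(void *unused)
{
	struct obj *ptr = malloc(sizeof(*ptr));

	(void)unused;
	assert(ptr != NULL);
	ptr->payload = 42;	/* plain store, ordered before the release */
	atomic_store_explicit(&gp, ptr, memory_order_release);
	return NULL;
}

/* "CPU Y": wait for the pointer with acquire semantics, then free it. */
static void *cpu_y(void *seen)
{
	struct obj *p;

	while (!(p = atomic_load_explicit(&gp, memory_order_acquire)))
		;	/* spin until X has published */
	*(int *)seen = p->payload;	/* ordered after the acquire load */
	free(p);
	return NULL;
}

/* Runs both sides once and returns the payload that Y observed. */
static int run_handoff(void)
{
	pthread_t x, y;
	int seen = 0;

	atomic_store(&gp, NULL);
	pthread_create(&y, NULL, cpu_y, &seen);
	pthread_create(&x, NULL, cpu_x, NULL);
	pthread_join(x, NULL);
	pthread_join(y, NULL);
	return seen;
}
```

Calling run_handoff() returns the payload read by the "CPU Y" thread.
Because the acquire load reads the pointer written by the release store,
that value is the 42 stored before publication.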
> One reason that I started this discussion was to argue that we should
> have a well-defined contract between the slab allocator and its users.

Yes, you have made that quite clear. But you're missing _my_ point.

Which is: The same mechanism that the slab allocator uses to ensure that
CPU X and CPU Y won't step on each other's toes if they both run
kmalloc/kfree at the same time should also be able to guarantee that the
metadata changes made by CPU X will be visible to CPU Y if Y manipulates
a slab that X just finished with.

To put it another way, ensuring non-interference during simultaneous
accesses isn't all that different from ensuring coherence during
sequential accesses. Doing the first should easily allow doing the
second.

And if it doesn't then something questionable is going on.
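
As a rough userspace illustration of that general claim, here is a
sketch that uses an ordinary pthread mutex. It is not meant to describe
what the slab allocator actually does, and struct toy_slab and all the
function names are hypothetical. The lock that stops two threads from
corrupting the count during simultaneous updates is the same thing that
makes one thread's update visible to the next thread that takes the
lock.

```c
#include <assert.h>
#include <pthread.h>

#define ITERS 100000

/* Hypothetical slab-like structure: metadata guarded by a single lock. */
struct toy_slab {
	pthread_mutex_t lock;
	int free_count;		/* the "metadata" */
	int released;		/* set once X is finished with the slab */
};

static struct toy_slab slab = { .lock = PTHREAD_MUTEX_INITIALIZER };

/* Simultaneous access: each thread bumps the count under the lock. */
static void *bump(void *unused)
{
	(void)unused;
	for (int i = 0; i < ITERS; i++) {
		pthread_mutex_lock(&slab.lock);
		slab.free_count++;
		pthread_mutex_unlock(&slab.lock);
	}
	return NULL;
}

/* Sequential access, X side: update the metadata, then let go. */
static void *finish_with_slab(void *unused)
{
	(void)unused;
	pthread_mutex_lock(&slab.lock);
	slab.free_count = 7;
	slab.released = 1;
	pthread_mutex_unlock(&slab.lock);
	return NULL;
}

/* Sequential access, Y side: once X has let go, read the metadata. */
static void *take_over_slab(void *seen)
{
	for (;;) {
		pthread_mutex_lock(&slab.lock);
		if (slab.released) {
			*(int *)seen = slab.free_count;
			pthread_mutex_unlock(&slab.lock);
			return NULL;
		}
		pthread_mutex_unlock(&slab.lock);
	}
}

/* Two threads updating at once; returns the final count (2 * ITERS). */
static int run_simultaneous(void)
{
	pthread_t a, b;

	slab.free_count = 0;
	slab.released = 0;
	pthread_create(&a, NULL, bump, NULL);
	pthread_create(&b, NULL, bump, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return slab.free_count;
}

/* X updates, Y then reads through the same lock; returns what Y saw (7). */
static int run_sequential(void)
{
	pthread_t x, y;
	int seen = -1;

	slab.free_count = 0;
	slab.released = 0;
	pthread_create(&y, NULL, take_over_slab, &seen);
	pthread_create(&x, NULL, finish_with_slab, NULL);
	pthread_join(x, NULL);
	pthread_join(y, NULL);
	return seen;
}
```

Neither case needs any barrier beyond the lock itself: the unlock in one
thread and the following lock in the other already provide the ordering.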

Alan Stern