Re: [BUG] Memory ordering between kmalloc() and kfree()? it's confusing!

From: Alan Stern

Date: Thu Feb 26 2026 - 11:01:55 EST

On Thu, Feb 26, 2026 at 03:35:08PM +0900, Harry Yoo wrote:
> Hello, SLAB, LKMM, and KCSAN folks!
>
> I'd like to discuss slab's assumption on users regarding memory ordering.
>
> Recently, I've been investigating an interesting slab memory ordering
> issue [3] [4] in v7.0-rc1, which made me think about memory ordering
> for slab objects.
>
> But without answering "What does slab expect users to do for correct
> operation?", I kept getting puzzled, and my brain hurt too much :/
> I'm writing things down to stop getting confused :)
>
> Since I have never thought about this before, my reasoning could be
> partially or entirely incorrect. If so, please kindly let me know.
>
> # Slab's assumption: Stores to object, its metadata, or struct slab
> # must be visible to the CPU that frees the object, when it is
> # passed to kfree(). It's users' responsibility to guarantee that.
>
> When the slab allocator allocates an object, it updates its metadata and
> struct slab fields. After allocation, the slab user updates the object's
> contents. As long as the object is freed on the same CPU that allocated
> it, kfree() can see those stores (a CPU must be able to see what's in
> its own store buffer), so no problem!
>
> However, when, for example, the pointer to the object is stored in a
> shared variable and the object is then freed on a different CPU, things
> become trickier.
>
> In this case, I think it's fair for the slab allocator to assume that:
>
> 1) Such stores must involve _at least_ a release barrier
> (for example, via {cmp,}xchg{,_release}, or smp_store_release())
> to ensure preceding stores are visible to other CPUs before
> the pointer store becomes visible, and
>
> 2) The CPU that frees an object must invoke at least an acquire
> barrier to ensure that stores to object content / metadata, etc.,
> are visible to the freeing CPU when it calls kfree().
>
> Because the slab allocator itself doesn't guarantee that such
> barriers are invoked within the allocator, it relies on users to
> do this when needed.
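
For readers following along, the hand-off described in 1) and 2) above
can be sketched in userspace. This is only an illustrative analogue under
stated assumptions, not slab code: malloc()/free() stand in for
kmalloc()/kfree(), a C11 release store stands in for smp_store_release(),
a C11 acquire load stands in for smp_load_acquire(), and the names
struct obj, shared_obj, producer(), consumer() and run_handoff() are all
hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical object type; stands in for any slab-allocated structure. */
struct obj {
	int payload;
};

/* Shared variable through which the pointer is handed to another thread. */
static _Atomic(struct obj *) shared_obj;
static int seen_payload;

/* "Allocating CPU": initialize the object, then publish it with release. */
static void *producer(void *arg)
{
	struct obj *p = malloc(sizeof(*p));

	(void)arg;
	if (!p)
		abort();
	p->payload = 42;	/* plain store to object contents */
	/* Release: the payload store is ordered before the pointer store. */
	atomic_store_explicit(&shared_obj, p, memory_order_release);
	return NULL;
}

/* "Freeing CPU": acquire the pointer, then read and free the object. */
static void *consumer(void *arg)
{
	struct obj *p;

	(void)arg;
	/* Acquire: pairs with the release store in producer(). */
	while (!(p = atomic_load_explicit(&shared_obj, memory_order_acquire)))
		;
	seen_payload = p->payload;	/* guaranteed to observe 42 */
	free(p);
	return NULL;
}

/* Runs one hand-off and returns the payload the freeing thread observed. */
int run_handoff(void)
{
	pthread_t prod, cons;

	atomic_store_explicit(&shared_obj, NULL, memory_order_relaxed);
	if (pthread_create(&cons, NULL, consumer, NULL))
		abort();
	if (pthread_create(&prod, NULL, producer, NULL))
		abort();
	pthread_join(prod, NULL);
	pthread_join(cons, NULL);
	assert(seen_payload == 42);
	return seen_payload;
}
```

The release/acquire pair is what makes the payload store visible to the
freeing thread before it touches the object. The sketch says nothing
about what ordering the allocator itself provides internally.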

It doesn't? Then how does the slab allocator guarantee that two
different CPUs won't try to perform allocations or deallocations from
the same slab at the same time, messing everything up?

Can you explain how this is meant to work, for those of us who don't
know anything about the slab allocator's internal mechanism?

Alan Stern