Re: [next-20260216]NULL pointer dereference in drain_obj_stock() (RCU free path)

From: Harry Yoo

Date: Mon Feb 23 2026 - 21:08:07 EST


On Mon, Feb 23, 2026 at 11:36:11AM +0900, Harry Yoo wrote:
> On Sun, Feb 22, 2026 at 03:48:53PM -0800, Shakeel Butt wrote:
> > On Sun, Feb 22, 2026 at 03:36:46PM -0800, Shakeel Butt wrote:
> > I asked AI to debug this crash report, with a nudge to look for
> > stride corruption, and it gave me the following output:

[...snip...]

> > ## CRITICAL: Memory Ordering Bug on PowerPC (Likely Root Cause)
> >
> > ### The Problem
> >
> > In `alloc_slab_obj_exts` (mm/slub.c lines 2199-2220), there is **NO memory barrier**
> > between the stride store and the obj_exts visibility via cmpxchg:
>
> This is actually a good point.
>
> > ```c
> > slab_set_stride(slab, sizeof(struct slabobj_ext)); // Store to stride (line 2199)
> > // NO MEMORY BARRIER HERE!
> > if (new_slab) {
> >         slab->obj_exts = new_exts; // Store to obj_exts (line 2207)
> > } else if (...) {
> > } else if (cmpxchg(&slab->obj_exts, ...) != ...) { // Atomic on obj_exts (line 2220)
> >         goto retry;
> > }
> > ```
> >
> > ### Why This Crashes on PowerPC
> >
> > PowerPC has a **weakly-ordered memory model**. Stores can be reordered and may not be
> > immediately visible to other processors. The cmpxchg provides a barrier AFTER it
> > executes, but the stride store BEFORE cmpxchg may not be visible when obj_exts becomes
> > visible.

I want to clarify one thing. The AI output is slightly incorrect:
cmpxchg() implies a full memory barrier when it succeeds (being a
conditional RMW operation that returns a value, it is fully ordered
on success), so the stride store cannot be reordered past it on the
write side.

The ordering is not enforced because the read side has no barriers:
the compiler or the CPU can reorder the loads and read slab->stride
before slab->obj_exts.

> > **Race Scenario:**
> > 1. CPU A: `slab_set_stride(slab, 16)` (store to stride, in CPU A's store buffer)
> > 2. CPU A: `cmpxchg(&slab->obj_exts, 0, new_exts)` succeeds, obj_exts is now visible
> > 3. CPU B: Sees `obj_exts` is set (from step 2)
> > 4. CPU B: Reads `slab->stride` → **sees OLD value (0 or garbage)** due to reordering!
> > 5. CPU B: `slab_obj_ext` calculates `obj_exts + 0 * index = obj_exts` for ALL indices!
> > 6. **All objects appear to share the same obj_ext at offset 0**
>
> Yes, that could actually happen, especially when the cache doesn't
> specify SLAB_ACCOUNT but allocates objects with __GFP_ACCOUNT set
> (e.g. xarray does that).
>
> With sheaves enabled for all caches, objects from the same slab can
> sit in different CPUs' sheaves, and those CPUs could race to
> allocate obj_exts and charge the objects.

--
Cheers,
Harry / Hyeonggon