Re: [PATCH] locking/osq_lock: Use READ_ONCE() for node->prev

From: David Laight

Date: Mon May 18 2026 - 10:39:24 EST

On Mon, 18 May 2026 11:29:53 +0100
Will Deacon <will@xxxxxxxxxx> wrote:

> On Mon, Mar 30, 2026 at 09:32:55AM +0800, Yu Peng wrote:
> > osq_lock() consults node->prev in the vcpu_is_preempted() heuristic while
> > a concurrent predecessor may update it via WRITE_ONCE(next->prev, prev)
> > during unqueue.
> >
> > This read only affects the decision to abort optimistic spinning; stale
> > values do not affect queue linkage or lock correctness. Use READ_ONCE()
> > to mark the shared read and match the concurrent WRITE_ONCE() update.
> >
> > No functional change intended.
> >
> > Signed-off-by: Yu Peng <pengyu@xxxxxxxxxx>
> > ---
> > kernel/locking/osq_lock.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> > index b4233dc2c2b04..db4545e7bb72c 100644
> > --- a/kernel/locking/osq_lock.c
> > +++ b/kernel/locking/osq_lock.c
> > @@ -144,7 +144,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
> > * polling, be careful.
> > */
> > if (smp_cond_load_relaxed(&node->locked, VAL || need_resched() ||
> > - vcpu_is_preempted(node_cpu(node->prev))))
> > + vcpu_is_preempted(node_cpu(READ_ONCE(node->prev)))))
> > return true;
>
> Hmm, I wonder whether this is actually sufficient...
>
> Architectures with relaxed memory models won't order plain reads to
> different addresses, so the read of 'node->locked' is unordered wrt the
> read of 'node->prev' in this condition. Given that we're using
> smp_cond_load_relaxed(), can we end up using a value of 'node->prev'
> that was loaded in a previous iteration of the loop?
>
> I'd be much more comfortable if this was smp_cond_load_acquire(), in
> addition to the READ_ONCE() that you are proposing.

I've got a patch 'pending' for this file that tidied some things up.
In particular it saves the cpu numbers not the per-cpu addresses.
So the above check doesn't bounce a cache line.
Perhaps it is time to resend it.

Note that the vpu_is_preempted() path is horribly expensive and
can't actually work reliably.
And can't work at all on arm where 'mwait' (equivalent) is used.

-- David

>
> Will
>