Re: [PATCH 3/4] locking: Introduce smp_cond_acquire()

From: Will Deacon
Date: Mon Dec 07 2015 - 10:18:44 EST


On Sat, Dec 05, 2015 at 12:43:37AM +0100, Peter Zijlstra wrote:
> On Fri, Dec 04, 2015 at 02:05:49PM -0800, Linus Torvalds wrote:
> > Of course, I suspect we should not use READ_ONCE(), but some
> > architecture-overridable version that just defaults to READ_ONCE().
> > Same goes for that "smp_rmb()". Because maybe some architectures will
> > just prefer an explicit acquire, and I suspect we do *not* want
> > architectures having to recreate and override that crazy loop.
> >
> > How much does this all actually end up mattering, btw?
>
> Not sure, I'll have to let Will quantify that. But the whole reason
> we're having this discussion is that ARM64 has a MONITOR+MWAIT like
> construct that they'd like to use to avoid the spinning.
>
> Of course, in order to use that, they _have_ to override the crazy loop.

Right. This also removes one of the few hurdles standing between us (arm64)
and generic locking routines such as qspinlock, where we really don't
want busy-wait loops (since cpu_relax() doesn't give us the opportunity
to use wfe safely).

> Now, Will and I spoke earlier today, and the version proposed by me (and
> you, since that is roughly similar) will indeed work for them in that it
> would allow them to rewrite the thing something like:
>
>
> 	typeof(*ptr) VAL;
> 	for (;;) {
> 		VAL = READ_ONCE(*ptr);
> 		if (expr)
> 			break;
> 		cmp_and_wait(ptr, VAL);
> 	}
>
>
> Where their cmp_and_wait(ptr, val) looks a little like:
>
> 	asm volatile(
> 	"	ldxr	%w0, %1		\n"
> 	"	sub	%w0, %w0, %w2	\n"
> 	"	cbnz	%w0, 1f		\n"
> 	"	wfe			\n"
> 	"1:"
>
> 	: "=&r" (tmp)
> 	: "Q" (*ptr), "r" (val)
> 	);
>
> (excuse my poor ARM asm foo)
>
> Which sets up a load-exclusive monitor, checks whether the value loaded
> matches what we previously saw and, if so, waits for an event.
>
> WFE will wake on any event that would've also invalidated a subsequent
> stxr or store-exclusive.
>
> ARM64 also of course can choose to use load-acquire instead of the
> READ_ONCE(), or still issue the smp_rmb(), dunno what is best for them.
> The load-acquire would (potentially) be issued multiple times, vs the
> rmb only once. I'll let Will sort that.

Yup. I'll just override the whole thing using something much like what
you're suggesting.
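
For readers following along, here is a rough userspace sketch of the loop
shape discussed above, written with C11 atomics rather than kernel
primitives. It is only an analogy under stated assumptions: the relaxed
load stands in for READ_ONCE(), the acquire fence stands in for the single
smp_rmb() after the loop, cmp_and_wait() is the hook name from the thread
with a do-nothing default (so this version simply spins), and
wait_until_clear() and its mask argument are hypothetical names chosen for
illustration. It is not the arm64 implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/*
 * Hypothetical hook (name borrowed from the thread): an architecture
 * could override this with a real wait primitive such as ldxr + wfe.
 * This default does nothing, so the loop below simply spins.
 */
#ifndef cmp_and_wait
#define cmp_and_wait(ptr, val) ((void)(ptr), (void)(val))
#endif

/*
 * Wait until none of the bits in @mask are set in *@ptr, then return
 * the value that satisfied the condition, with acquire ordering.
 * wait_until_clear() is an illustrative name, not a kernel API.
 */
static inline unsigned int
wait_until_clear(_Atomic unsigned int *ptr, unsigned int mask)
{
	unsigned int val;

	assert(ptr != NULL);

	for (;;) {
		/* Relaxed load, standing in for READ_ONCE(). */
		val = atomic_load_explicit(ptr, memory_order_relaxed);
		if (!(val & mask))
			break;
		/* Only the waiting step is meant to be replaceable. */
		cmp_and_wait(ptr, val);
	}

	/* One fence after the loop, standing in for the single smp_rmb(). */
	atomic_thread_fence(memory_order_acquire);
	return val;
}
```

The alternative mentioned above would drop the fence and use
memory_order_acquire on the load itself, which pays for the acquire on
every iteration instead of once at exit.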

Cheers,

Will