Re: [RFC][PATCH] spin loop arch primitives for busy waiting

From: Linus Torvalds
Date: Thu Apr 06 2017 - 15:42:04 EST


On Thu, Apr 6, 2017 at 12:23 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> Something like so then. According to the SDM mwait is a no-op if we do
> not execute monitor first. So this variant should get the first
> iteration without expensive instructions.

No, the problem is that we *would* have executed a prior monitor that
could still be pending - from a previous invocation of
smp_cond_load_acquire().

Especially with spinlocks, these things can very much happen back-to-back.

And it would be pending with a different address (the previous
spinlock) that might not have changed since then (and might not be
changing), so now we might actually be pausing in mwait waiting for
that *other* thing to change.
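
To make the stale-monitor hazard concrete, here is a toy userspace model, not kernel code and not real hardware semantics. Everything in it (struct toy_cpu, toy_monitor(), toy_mwait(), and the two wait variants) is a hypothetical name invented for illustration. It assumes only what is stated above: a monitor armed by an earlier wait can still be pending, and mwait falls through when nothing is armed. It leaves out the fact that a store to the monitored line would also wake the CPU.

```c
#include <stddef.h>

/* Toy model of one CPU's monitor state (hypothetical, for illustration). */
struct toy_cpu {
	const volatile int *armed; /* address armed by the last toy_monitor(), or NULL */
};

/* Arm the monitor on addr; a later arm replaces an earlier one. */
void toy_monitor(struct toy_cpu *cpu, const volatile int *addr)
{
	cpu->armed = addr;
}

/*
 * Returns the address the CPU would go to sleep on, or NULL if mwait
 * falls through because nothing is armed. The armed state is consumed.
 */
const volatile int *toy_mwait(struct toy_cpu *cpu)
{
	const volatile int *addr = cpu->armed;

	cpu->armed = NULL;
	return addr;
}

/* Assumed shape of a loop whose first iteration reaches mwait before monitor. */
const volatile int *wait_mwait_first(struct toy_cpu *cpu, const volatile int *ptr)
{
	if (*ptr)
		return NULL;		/* condition already true, no sleep */
	return toy_mwait(cpu);		/* sleeps on whatever was armed earlier */
}

/* Check, arm the monitor on *this* address, re-check, then mwait. */
const volatile int *wait_monitor_first(struct toy_cpu *cpu, const volatile int *ptr)
{
	if (*ptr)
		return NULL;		/* fast path, no monitor/mwait at all */
	toy_monitor(cpu, ptr);		/* replaces any stale monitor */
	if (*ptr)
		return NULL;		/* value changed while arming */
	return toy_mwait(cpu);
}
```

In this model, if an earlier wait left the monitor armed on lock_a, wait_mwait_first(&cpu, &lock_b) reports that the CPU would sleep on lock_a, while wait_monitor_first(&cpu, &lock_b) reports lock_b.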

So it would probably need to do something complicated like

#define smp_cond_load_acquire(ptr, cond_expr) \
({ \
	typeof(ptr) __PTR = (ptr); \
	typeof(*ptr) VAL; \
	do { \
		/* fast path: plain load, no monitor/mwait */ \
		VAL = READ_ONCE(*__PTR); \
		if (cond_expr) \
			break; \
		for (;;) { \
			/* arm the monitor on this address, then re-check */ \
			___monitor(__PTR, 0, 0); \
			VAL = READ_ONCE(*__PTR); \
			if (cond_expr) \
				break; \
			___mwait(0xf0 /* C0 */, 0); \
		} \
	} while (0); \
	smp_acquire__after_ctrl_dep(); \
	VAL; \
})

which might just generate nasty enough code to not be worth it.
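
As a rough userspace sketch of that control flow, the function below mirrors the check, arm, re-check, wait structure using C11 atomics. The names stub_monitor(), stub_mwait(), pending_store and cond_load_acquire_nonzero() are hypothetical stand-ins. The stubs only count calls (and stub_mwait() optionally simulates a remote store), and the acquire fence is only an approximation of smp_acquire__after_ctrl_dep(). It is meant to show that the already-true case touches neither stub, not to measure generated code.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Call counters for the hypothetical stubs below. */
static unsigned long monitor_calls, mwait_calls;

/* Hypothetical: word that a "remote CPU" sets while we sit in stub_mwait(). */
static _Atomic int *pending_store;

/* Stand-in for arming the monitor; only counts invocations. */
static void stub_monitor(const void *addr)
{
	(void)addr;
	monitor_calls++;
}

/* Stand-in for mwait; counts invocations and simulates the waking store. */
static void stub_mwait(void)
{
	mwait_calls++;
	if (pending_store)
		atomic_store_explicit(pending_store, 1, memory_order_release);
}

/* Wait until *ptr is nonzero and return the value with acquire ordering. */
int cond_load_acquire_nonzero(_Atomic int *ptr)
{
	int val = atomic_load_explicit(ptr, memory_order_relaxed);

	if (!val) {
		for (;;) {
			stub_monitor(ptr);	/* arm on this address */
			val = atomic_load_explicit(ptr, memory_order_relaxed);
			if (val)
				break;		/* changed before we slept */
			stub_mwait();
		}
	}
	/* Approximate userspace stand-in for smp_acquire__after_ctrl_dep(). */
	atomic_thread_fence(memory_order_acquire);
	return val;
}
```

When the word is already nonzero, both counters stay at zero. When it starts at zero and the simulated store arrives during the first stub_mwait(), the loop arms the monitor twice and waits once.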

I dunno

Linus