Re: Memory barrier needed with wake_up_process()?
From: Peter Zijlstra
Date: Tue Sep 06 2016 - 08:21:03 EST
On Tue, Sep 06, 2016 at 01:49:37PM +0200, Peter Zijlstra wrote:
> On Tue, Sep 06, 2016 at 02:43:39PM +0300, Felipe Balbi wrote:
> > My fear now, however, is that changing smp_[rw]mb() to smp_mb() just
> > adds extra overhead which makes the problem much, much less likely to
> > happen. Does that sound plausible to you?
>
> I did consider that, but I've not sufficiently grokked the code to rule
> out actual fail. So let me stare at this a bit more.
OK, so I'm really not seeing it, we've got:
while (bh->state != FULL) {
for (;;) {
set_current_state(INTERRUPTIBLE); /* MB after */
if (signal_pending(current))
return -EINTR;
if (common->thread_wakeup_needed)
break;
schedule(); /* MB */
}
__set_current_state(RUNNING);
common->thread_wakeup_needed = 0;
smp_rmb(); /* NOP */
}
VS.
spin_lock(&common->lock); /* MB */
bh->state = FULL;
smp_wmb(); /* NOP */
common->thread_wakeup_needed = 1;
wake_up_process(common->thread_task); /* MB before */
spin_unlock(&common->lock);
(the MB annotations specific to x86, not true in general)
If we observe thread_wakeup_needed, we must also observe bh->state.
And the sleep/wakeup ordering is also correct, we either see
thread_wakeup_needed and continue, or we see task->state == RUNNING
(from the wakeup) and NO-OP schedule(). The MB from set_current_statE()
then matches with the MB from wake_up_process() to ensure we must see
thead_wakeup_needed.
Or, we go sleep, and get woken up, at which point the same happens.
Since the waking CPU gets the task back on its RQ the happens-before
chain includes the waking CPUs state along with the state of the task
itself before it went to sleep.
At which point we're back where we started, once we see
thread_wakeup_needed we must then also see bh->state (and all state
prior to that on the waking CPU).
There's enough cruft in the while-sleep loop to force reload bh->state.
Load/store tearing cannot be a problem because all values are single
bytes (the variables are multi bytes, but all values used only affect
the LSB).
Colour me puzzled.