Memory barrier needed with wake_up_process()?
From: Alan Stern
Date: Fri Sep 02 2016 - 14:10:19 EST
Paul, Peter, and Ingo:
This must have come up before, but I don't know what was decided.
Isn't it often true that a memory barrier is needed before a call to
wake_up_process()? A typical scenario might look like this:
CPU 0
-----
	for (;;) {
		set_current_state(TASK_INTERRUPTIBLE);
		if (signal_pending(current))
			break;
		if (wakeup_flag)
			break;
		schedule();
	}
	__set_current_state(TASK_RUNNING);
	wakeup_flag = 0;

CPU 1
-----
	wakeup_flag = 1;
	wake_up_process(my_task);
The underlying pattern is:
	CPU 0                           CPU 1
	-----                           -----
	write current->state            write wakeup_flag
	smp_mb();
	read wakeup_flag                read my_task->state
where set_current_state() does the write to current->state and
automatically adds the smp_mb(), and wake_up_process() reads
my_task->state to see whether the task needs to be woken up.
The kerneldoc for wake_up_process() says that it has no implied memory
barrier if it doesn't actually wake anything up. And even when it
does, the implied barrier is only smp_wmb(), not smp_mb().
This is the so-called SB (Store Buffering) pattern, which is well known to
require a full smp_mb() on both sides. Since wake_up_process() doesn't
include smp_mb(), isn't it correct that the caller must add it
explicitly?
In other words, shouldn't the code for CPU 1 really be:
	wakeup_flag = 1;
	smp_mb();
	wake_up_process(my_task);
If my reasoning is correct, then why doesn't wake_up_process() include
this memory barrier automatically, the way set_current_state() does?
There could be an alternate version (__wake_up_process()) which omits
the barrier, just like __set_current_state().
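For anyone who wants to experiment with the pattern outside the kernel,
here is a rough userspace sketch using C11 atomics and pthreads. It is
only an analogue, not kernel code: task_state, sleeper(), waker() and
run_sb_trials() are hypothetical names invented for the illustration,
and a C11 seq_cst fence merely stands in for smp_mb(). With a fence on
both sides, the outcome where both threads read 0 (the lost wakeup) is
forbidden.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for the two shared locations in the pattern. */
static atomic_int task_state;	/* models my_task->state; 1 = sleeping */
static atomic_int wakeup_flag;	/* models wakeup_flag */
static int r_sleeper, r_waker;	/* value each side read */

/* CPU 0: write state, full barrier, read flag. */
static void *sleeper(void *arg)
{
	(void)arg;
	atomic_store_explicit(&task_state, 1, memory_order_relaxed);
	/* Stands in for the smp_mb() inside set_current_state(). */
	atomic_thread_fence(memory_order_seq_cst);
	r_sleeper = atomic_load_explicit(&wakeup_flag, memory_order_relaxed);
	return NULL;
}

/* CPU 1: write flag, full barrier, read state. */
static void *waker(void *arg)
{
	(void)arg;
	atomic_store_explicit(&wakeup_flag, 1, memory_order_relaxed);
	/* Stands in for the explicit smp_mb() proposed above. */
	atomic_thread_fence(memory_order_seq_cst);
	r_waker = atomic_load_explicit(&task_state, memory_order_relaxed);
	return NULL;
}

/*
 * Run the two sides concurrently `trials` times.  Returns the number
 * of trials in which both sides read 0, i.e. the sleeper missed the
 * flag and the waker saw the task as still running.  Returns -1 if a
 * thread could not be created.
 */
int run_sb_trials(int trials)
{
	int lost = 0;

	for (int i = 0; i < trials; i++) {
		pthread_t t0, t1;

		atomic_store(&task_state, 0);
		atomic_store(&wakeup_flag, 0);

		if (pthread_create(&t0, NULL, sleeper, NULL) != 0)
			return -1;
		if (pthread_create(&t1, NULL, waker, NULL) != 0) {
			pthread_join(t0, NULL);
			return -1;
		}
		pthread_join(t0, NULL);
		pthread_join(t1, NULL);

		if (r_sleeper == 0 && r_waker == 0)
			lost++;
	}
	return lost;
}
```

Calling run_sb_trials() from a small main() and printing the result
should always report 0 lost wakeups. If the fence in waker() is
removed, the C11 model permits the both-zero outcome, which is the
case the question above is about; whether it actually shows up in a
given run depends on the hardware and timing.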
Alan Stern