Re: [PATCH v4 10/10] IB/mlx5: Simplify completion into a wait_event
From: Linus Torvalds
Date: Thu Nov 03 2016 - 11:38:00 EST
On Wed, Oct 26, 2016 at 11:59 PM, Binoy Jayan <binoy.jayan@xxxxxxxxxx> wrote:
> Convert the completion 'mlx5_ib_umr_context:done' to a wait_event as it
> just waits for the return value to be filled.
This is wrong.
Since that "umr_context" variable is on the stack, and you are waiting
for it to be fully done, it really should be a completion.
Why?
Because a "complete()" guarantees that the *last* access to the
variable is done by the thread that does "wait_for_completion()".
In contrast, doing a "wait_event()" can actually *race* with whoever
wakes it up, and the "wait_event()" may race with the wakeup, where
the wakeup() ends up removing the entry from the list, so that
"list_empty_careful()" sees that it's all done, but the "wakeup()" can
still be holding (and about to release) the waitqueue spinlock.
What's the problem?
When the waiter is on the stack, the umr_context" variable may be
about to be relased, and then the "wake_up()" may end up accessing the
umr_context waitqueue spinlock after it has already become something
else.
This is unlikely, but it's very much one of the things that a
"completion" is all about. A completion (along with a plain spinlock)
is guaranteed to be synchronous in a way that semaphores and
wait-queues are *not*, so that when somebody has done "complete()",
there is no access to the variable that can race.
So no. A wait-queue wakeup is NOT AT ALL the same thing as a completion.
There really is a very real reason why "complete()" exists, and why it
is used for one-time initializations like this. "complete()" along
with spinlocks are synchronous in ways that other concepts are not.
Linus