Re: Memory barrier needed with wake_up_process()?

From: Alan Stern
Date: Sat Sep 03 2016 - 10:27:14 EST


On Sat, 3 Sep 2016, Peter Zijlstra wrote:

> On Fri, Sep 02, 2016 at 04:29:19PM -0400, Alan Stern wrote:
> > I'm afraid so. The code doesn't use wait_event(), in part because
> > there's no wait_queue (since only one task is involved).
>
> You can use wait_queue fine with just one task, and it would clean up
> the code tremendously.
>
> You can replace things like the earlier mentioned:
>
> while (bh->state != BUF_STATE_EMPTY) {
> rc = sleep_thread(common, false);
> if (rc)
> return rc;
> }
>
> with:
>
> rc = wait_event_interruptible(&common->wq, bh->state == BUF_STATE_EMPTY);
> if (rc)
> return rc;

If someone wants to devote time and effort to cleaning up the driver,
that would be a good start.

> > But maybe there's another barrier which needs to be fixed. Felipe, can
> > you check to see if received_cbw() is getting called in
> > get_next_command(), and if so, what value it returns? Or is the
> > preceding sleep_thread() the one that never wakes up?
> >
> > It could be that the smp_wmb() in wakeup_thread() needs to be smp_mb().
> > The reason being that get_next_command() runs outside the protection of
> > the spinlock.
>
> Being somewhat confused by the code, I fail to follow that argument.
> wakeup_thread() is always called under that spinlock(), but since the
> critical section is 2 stores, I fail to see how a smp_mb() can make any
> difference over the smp_wmb() already there.

But sleep_thread() and the code that follows it are _not_ called under
the spinlock. And the following code examines values that were written
by DMA, not by the CPU calling wakeup_thread(). (Although that CPU
_is_ the one that receives the DMA-completion notice.)

In other words, we have:

CPU 0 CPU 1
----- -----
Start DMA Handle DMA-complete irq
Sleep until bh->state Set bh->state
smp_wmb()
Wake up CPU 0
smp_rmb()
Compute rc based on contents
of the DMA buffer

This was written many years ago, at a time when I did not fully
understand all the details of memory ordering. Do you agree that both
of those barriers should really be smp_mb()? That's what Felipe has
been testing.

Alan Stern