Re: Memory barrier needed with wake_up_process()?
From: Alan Stern
Date: Sat Sep 03 2016 - 10:17:24 EST
On Sat, 3 Sep 2016, Peter Zijlstra wrote:
> On Sat, Sep 03, 2016 at 09:58:09AM +0300, Felipe Balbi wrote:
>
> > > What arch are you seeing this on?
> >
> > x86. Skylake to be exact.
>
> So it _cannot_ be the thing Alan mentioned. By the simple fact that
> spin_lock() is a full barrier on x86 (every LOCK prefixed instruction
> is).
True, my guess was wrong.
> > The following change survived through the night:
> >
> > diff --git a/drivers/usb/gadget/function/f_mass_storage.c b/drivers/usb/gadget/function/f_mass_storage.c
> > index 8f3659b65f53..d31581dd5ce5 100644
> > --- a/drivers/usb/gadget/function/f_mass_storage.c
> > +++ b/drivers/usb/gadget/function/f_mass_storage.c
> > @@ -395,7 +395,7 @@ static int fsg_set_halt(struct fsg_dev *fsg, struct usb_ep *ep)
> > /* Caller must hold fsg->lock */
> > static void wakeup_thread(struct fsg_common *common)
> > {
> > - smp_wmb(); /* ensure the write of bh->state is complete */
> > + smp_mb(); /* ensure the write of bh->state is complete */
> > /* Tell the main thread that something has happened */
> > common->thread_wakeup_needed = 1;
> > if (common->thread_task)
> > @@ -626,7 +626,7 @@ static int sleep_thread(struct fsg_common *common, bool can_freeze)
> > }
> > __set_current_state(TASK_RUNNING);
> > common->thread_wakeup_needed = 0;
> > - smp_rmb(); /* ensure the latest bh->state is visible */
> > + smp_mb(); /* ensure the latest bh->state is visible */
> > return rc;
> > }
>
> Sorry, but that is horrible code. A barrier cannot ensure writes are
> 'complete', at best they can ensure order between writes (or reads
> etc..).
The code is better than the comment. What I really meant was that the
write of bh->state needs to be visible to the thread after it wakes up
(or after it checks the wakeup condition and skips going to sleep).
> Also, looking at that thing, that common->thread_wakeup_needed variable
> is 100% redundant. All sleep_thread() invocations are inside a loop of
> sorts and basically wait for other conditions to become true.
>
> For example:
>
> while (bh->state != BUF_STATE_EMPTY) {
> rc = sleep_thread(common, false);
> if (rc)
> return rc;
> }
>
> All you care about there is bh->state, _not_
> common->thread_wakeup_needed.
You know, I never went through and verified that _all_ the invocations
of sleep_thread() are like that. In fact, I wrote the sleep/wakeup
routines _before_ the rest of the code, and I didn't know in advance
exactly how they were going to be called.
> That said, I cannot spot an obvious fail, but the code can certainly use
> help.
The problem may be that when the thread wakes up (or skips going to
sleep), it needs to see more than just bh->state. Those other values
it needs are not written by the same CPU that calls wakeup_thread(),
and so to ensure that they are visible that smp_wmb() really ought to
be smp_mb() (and correspondingly in the thread. That's what Felipe has
been testing.
Alan Stern