Re: [PATCH] RFC: console: hack up console_lock more v3

From: Peter Zijlstra
Date: Thu May 09 2019 - 09:38:38 EST

Next message: Jason Gunthorpe: "[GIT PULL] Please pull RDMA subsystem changes"
Previous message: Theodore Ts'o: "Re: [PATCH v2 00/17] kunit: introduce KUnit, the Linux kernel unit testing framework"
In reply to: Daniel Vetter: "Re: [PATCH] RFC: console: hack up console_lock more v3"
Next in thread: Petr Mladek: "Re: [PATCH] RFC: console: hack up console_lock more v3"

On Thu, May 09, 2019 at 03:06:09PM +0200, Daniel Vetter wrote:
> On Thu, May 9, 2019 at 2:31 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > On Thu, May 09, 2019 at 02:09:03PM +0200, Daniel Vetter wrote:
> > > Fix this by creating a printk_safe_up() which calls wake_up_process
> > > outside of the spinlock. This isn't correct in full generality, but
> > > good enough for console_lock:
> > >
> > > - console_lock doesn't use interruptible or killable or timeout down()
> > > calls, hence an up() is the only thing that can wake up a process.
> >
> > Wrong :/ Any task can be woken at any random time. We must, at all
> > times, assume spurious wakeups will happen.
>
> Out of curiosity, where do these come from? I know about the races
> where you need to recheck on the waiter side to avoid getting stuck,
> but didn't know about this. Are these earlier (possibly spurious)
> wakeups that got held up and delayed for a while, then hit the task
> much later when it's already continued doing something else?

Yes, this. So they all more or less have the form:

    CPU0                            CPU1

    enqueue_waiter()
                                    done = true;
                                    if (waiters)
    for (;;) {
      if (done)
        break;

      ...
    }

    dequeue_waiter()

    do something else again

                                      wake_up_task
    <gets wakeup>

The wake_q thing made the above much more common, but we've had it
forever.
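
To make the interleaving above concrete, here is a single-threaded toy model in plain C. It is only a sketch: toy_task, toy_wake(), toy_sleep() and second_wait_proceeds() are made-up names for illustration, not kernel interfaces, and the "scheduler" is reduced to one pending-wakeup flag. It replays the diagram step by step and shows that a waiter which trusts the wakeup alone proceeds on the stale wakeup, while a waiter that rechecks its own condition does not.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel API: a task with one "wakeup pending" flag. */
struct toy_task {
	bool wakeup_pending;
};

/* Stands in for wake_up_process(); it may land arbitrarily late. */
static void toy_wake(struct toy_task *t)
{
	t->wakeup_pending = true;
}

/* Stands in for going to sleep: reports (and consumes) a pending wakeup. */
static bool toy_sleep(struct toy_task *t)
{
	bool woken = t->wakeup_pending;

	t->wakeup_pending = false;
	return woken;
}

/*
 * Replays the diagram. Returns true if the *second*, unrelated wait
 * proceeds even though its own condition (done2) was never set.
 */
static bool second_wait_proceeds(bool recheck)
{
	struct toy_task task = { .wakeup_pending = false };
	bool done1 = false, done2 = false;

	/* CPU1: publish the first condition; the wakeup itself is delayed. */
	done1 = true;

	/* CPU0: the first wait loop sees done1 and breaks without sleeping. */
	assert(done1);

	/* CPU0 has dequeued and moved on. CPU1: the late wakeup lands now. */
	toy_wake(&task);

	/* CPU0: waits for something else; done2 is still false. */
	bool woken = toy_sleep(&task);

	/* A waiter that rechecks only proceeds if its condition holds. */
	return recheck ? (woken && done2) : woken;
}
```

Calling second_wait_proceeds(false) models a waiter that assumes "woken means my condition is true"; second_wait_proceeds(true) models the usual recheck-in-a-loop waiter.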

> Or even
> more random, and even if I never put a task on a wait list or anything
> else, ever, it can get woken spuriously?

I had patches that did that on purpose, but no.

> > Something like the below might work.
>
> Yeah that looks like the proper fix. I guess semaphores are uncritical
> enough that we can roll this out for everyone. Thanks for the hint.

It's actually an optimization that we never did because semaphores are
so uncritical :-)

The thing is, by delaying the wakeup until after we've released the
spinlock, the waiter will not contend on the spinlock the moment it
wakes.
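
A minimal userspace sketch of that ordering, assuming POSIX threads rather than the kernel's semaphore and wake_q code. The names toy_sem, TOY_SEM_INIT, toy_down() and toy_up() are hypothetical and this is not the patch discussed in the thread. The point it illustrates: decide whether a wakeup is needed while holding the lock, but issue it only after the lock is dropped, so the woken thread does not immediately block on a lock the waker still holds. The waiter rechecks the count in a loop, so stray wakeups are harmless.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical toy semaphore; not the kernel's struct semaphore. */
struct toy_sem {
	pthread_mutex_t lock;
	pthread_cond_t  wait;
	unsigned int    count;    /* available units */
	unsigned int    waiters;  /* threads inside toy_down()'s wait loop */
};

#define TOY_SEM_INIT(n) \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, (n), 0 }

static void toy_down(struct toy_sem *sem)
{
	pthread_mutex_lock(&sem->lock);
	sem->waiters++;
	/* Recheck after every wakeup: it may be spurious or stale. */
	while (sem->count == 0)
		pthread_cond_wait(&sem->wait, &sem->lock);
	sem->waiters--;
	assert(sem->count > 0);
	sem->count--;
	pthread_mutex_unlock(&sem->lock);
}

static void toy_up(struct toy_sem *sem)
{
	int need_wake;

	pthread_mutex_lock(&sem->lock);
	sem->count++;
	need_wake = sem->waiters > 0;	/* decide under the lock ... */
	pthread_mutex_unlock(&sem->lock);

	/* ... but wake only after dropping it, so the waiter can take it. */
	if (need_wake)
		pthread_cond_signal(&sem->wait);
}
```

Because the count is updated under the lock and every waiter checks it under the same lock before sleeping, moving the signal past the unlock cannot lose a wakeup in this sketch; at worst a waiter is woken, finds the count already taken, and goes back to sleep.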
