Re: [PATCH 2/4] swait: add the missing killable swaits

From: Linus Torvalds
Date: Fri Jul 07 2017 - 18:48:33 EST


On Fri, Jul 7, 2017 at 3:27 PM, Davidlohr Bueso <dave@xxxxxxxxxxxx> wrote:
>
> Ok sorry, fwiw those were 80-line fixlets I thought were trivial enough
> to just fly by.

I find them annoying, because it makes it so much harder to see what
the patch actually does.

In this case, I think that more than 50% of the patch was just
whitespace changes.

> Oh indeed, this was always my intent. Going back to the patch, when
> checking DEFINE_WAIT_FUNC I clearly overlooked the ->func()
> implications, breaking all kinds of semantics. With that and the
> constraints aforementioned in the patch, I see no sane way of using
> wake_qs.

Well, very few people actually use "wake_up_all()", particularly for
any of the things that use special wake functions.

So it probably works in practice.

And then somebody starts using pollfd or something on one of the things
that *do* use wake_up_all() and happens to also allow polling (or
whatever), and you get nasty crashes.
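
To make the failure mode concrete, here is a rough userspace sketch (not
kernel code; every demo_cb_* name is hypothetical). It models a wait list
where each entry carries its own wake callback, and compares a wake-all
that goes through the callback with a batched shortcut that assumes every
entry is a plain sleeping task. The shortcut here only counts the entries
it would get wrong. A version that blindly treated the private pointer as
a task is presumably where the crashes would come from.

```c
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct demo_cb_task { int runnable; };
struct demo_cb_poll { int events_seen; };

struct demo_cb_entry {
	int (*func)(struct demo_cb_entry *entry); /* per-entry wake callback */
	void *private;  /* a task for default entries, anything for custom ones */
	struct demo_cb_entry *next;
};

/* Default callback: private really is a task, so mark it runnable. */
int demo_cb_default_wake(struct demo_cb_entry *entry)
{
	struct demo_cb_task *task = entry->private;

	task->runnable = 1;
	return 1;
}

/* Poll-like custom callback: private is NOT a task here. */
int demo_cb_poll_wake(struct demo_cb_entry *entry)
{
	struct demo_cb_poll *state = entry->private;

	state->events_seen++;
	return 1;
}

/* Conventional wake-all: always goes through ->func. Returns calls made. */
int demo_cb_wake_all(struct demo_cb_entry *first)
{
	struct demo_cb_entry *entry;
	int called = 0;

	for (entry = first; entry; entry = entry->next) {
		entry->func(entry);
		called++;
	}
	return called;
}

/*
 * Batched shortcut: wakes tasks directly and never calls ->func.
 * Returns how many entries had a custom callback that was skipped.
 * The model leaves those entries untouched rather than misusing private.
 */
int demo_cb_wake_all_batched(struct demo_cb_entry *first)
{
	struct demo_cb_entry *entry;
	int skipped = 0;

	for (entry = first; entry; entry = entry->next) {
		if (entry->func != demo_cb_default_wake) {
			skipped++;
			continue;
		}
		((struct demo_cb_task *)entry->private)->runnable = 1;
	}
	return skipped;
}
```

With one default entry and one poll-like entry on the list, the batched
variant wakes the task but the poll state never sees its event.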

> Given that you seem to agree that the lockless version is possible as
> long as we keep semantics, this imho is another point for some form of
> simplified waitqueues.

We just really haven't had a lot of problems with the waitqueues in my
experience. Many of the historical big problems were about the whole
"exclusive vs non-exclusive" thundering herd problems, which is
actually the most complex thing about them (the callback function adds
a pointer to the wait queue, so makes it bigger, but that is very
seldom a huge issue).

Most of the things that want specific wakeups tend to be some really
low-level stuff (i.e. semaphores etc., both the sysvipc kind and the
kernel locking kind).

They are often doing their own very special things anyway. And often
the regular waitqueues actually work fine, and the biggest thing is to
use the lock within the waitqueue for the object that is being waited
on too, so that you just avoid the double locking.
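
A toy illustration of the double-locking point (a single-threaded
userspace model, not kernel code; every demo_* name is hypothetical, and
the "lock" merely counts acquisitions). The first semaphore-like object
has its own lock and then takes the wait queue's lock to do the wakeup.
The second protects its count with the wait queue's lock directly, so one
acquisition covers both the state change and the wakeup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model lock: it only records how often it was taken. */
struct demo_lock {
	int held;
	int acquisitions;
};

void demo_lock_acquire(struct demo_lock *lock)
{
	assert(!lock->held);	/* models a non-recursive lock */
	lock->held = 1;
	lock->acquisitions++;
}

void demo_lock_release(struct demo_lock *lock)
{
	assert(lock->held);
	lock->held = 0;
}

struct demo_waitq {
	struct demo_lock lock;	/* protects the waiter count */
	int waiters;
};

/* Caller must already hold wq->lock. Returns the number of waiters woken. */
int demo_wake_one_locked(struct demo_waitq *wq)
{
	assert(wq->lock.held);
	if (!wq->waiters)
		return 0;
	wq->waiters--;
	return 1;
}

/* Variant 1: the object has its own lock in addition to the queue's. */
struct demo_sem_separate {
	struct demo_lock lock;	/* protects count */
	int count;
	struct demo_waitq wq;
};

int demo_sem_separate_up(struct demo_sem_separate *sem)
{
	int woken;

	demo_lock_acquire(&sem->lock);
	sem->count++;
	demo_lock_acquire(&sem->wq.lock);	/* the second, avoidable lock */
	woken = demo_wake_one_locked(&sem->wq);
	demo_lock_release(&sem->wq.lock);
	demo_lock_release(&sem->lock);
	return woken;
}

/* Variant 2: count is protected by the wait queue's own lock. */
struct demo_sem_shared {
	int count;		/* protected by wq.lock */
	struct demo_waitq wq;
};

int demo_sem_shared_up(struct demo_sem_shared *sem)
{
	int woken;

	demo_lock_acquire(&sem->wq.lock);
	sem->count++;
	woken = demo_wake_one_locked(&sem->wq);
	demo_lock_release(&sem->wq.lock);
	return woken;
}
```

Both variants bump the count and wake one waiter. The first takes two
locks per operation, the second takes one.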

So you may have hit the one or two cases where the usual wait-queues
didn't work well, but in *most* cases they work wonderfully.

Linus