Re: re. Spurious wakeup on a newly created kthread

From: Tejun Heo
Date: Sun Jun 26 2022 - 02:09:32 EST


Hello,

On Sat, Jun 25, 2022 at 07:53:34PM -0700, Linus Torvalds wrote:
...
> Is that the _common_ pattern? It's not *uncommon*. It's maybe not the
> strictly normal wait-queue one, but if you grep for
> "wake_up_process()" you will find quite a lot of them.
...
> > I'm probably missing something. Is it about bespoke wait mechanisms?
> > Can you give a concrete example of an async wakeup scenario?
>
> Grep for wake_up_process() and just look for them.

Right, for kthreads, a custom work list plus a directed wakeup at the
known handler task is a pattern seen across the code base, and such
wakeups can affect any other wait the kthread happens to be doing.
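
To make that concrete, here's a minimal sketch of the pattern (all
names below are made up for illustration):

  #include <linux/kthread.h>
  #include <linux/list.h>
  #include <linux/sched.h>
  #include <linux/spinlock.h>

  static LIST_HEAD(my_work_list);
  static DEFINE_SPINLOCK(my_lock);
  static struct task_struct *my_kthread;

  static int my_thread_fn(void *unused)
  {
          for (;;) {
                  /* set state before checking the condition so that a
                   * concurrent my_queue() can't slip its wakeup in
                   * between the check and the sleep */
                  set_current_state(TASK_INTERRUPTIBLE);
                  if (kthread_should_stop()) {
                          __set_current_state(TASK_RUNNING);
                          break;
                  }
                  if (list_empty(&my_work_list)) {
                          /* may return due to an unrelated wakeup, so
                           * loop back and re-check the condition */
                          schedule();
                          continue;
                  }
                  __set_current_state(TASK_RUNNING);
                  /* pop and process entries under my_lock here */
          }
          return 0;
  }

  static void my_queue(struct list_head *item)
  {
          spin_lock(&my_lock);
          list_add_tail(item, &my_work_list);
          spin_unlock(&my_lock);
          wake_up_process(my_kthread);    /* directed, no waitqueue */
  }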

...
> So you need to always do that in a loop. The wait_event code will do
> that loop for you, but if you do manual wait-queues you are required
> to do the looping yourself.

The reason this sometimes bothers me is that with simple kthread use
cases, there are places where all the legitimate wakeup sources are
clearly known. In those cases, the fact that I can't think of a
situation where the looping would actually be needed creates a subtle
nagging feeling when writing them.
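
FWIW, that loop is what wait_event() hides. Open-coded against a
waitqueue it looks roughly like this (simplified; `wq' and `condition'
stand in for the real thing):

  DEFINE_WAIT(wait);

  for (;;) {
          prepare_to_wait(&wq, &wait, TASK_UNINTERRUPTIBLE);
          if (condition)          /* re-checked on every pass */
                  break;
          schedule();             /* may return spuriously */
  }
  finish_wait(&wq, &wait);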

...
> Again, none of these are *really* spurious. They are real wakeup
> events. It's just that within the *local* code they look spurious,
> because the locking code, the disk IO code, whatever the code is
> doesn't know or care about all the other things that process is
> involved in.

I see. Yeah, waiting on multiple sources which may not be known to
each piece of wait logic makes sense.

...
> But I don't think that's what's going on here. I think the workqueue
> code is just confused, and should have initialized "worker->pool" much
> earlier. Because as things are now, when worker_thread() starts
> running, and does that
>
> static int worker_thread(void *__worker)
> {
>         struct worker *worker = __worker;
>         struct worker_pool *pool = worker->pool;
>
> thing, that can happen *immediately* after that
>
> kthread_create_on_node(worker_thread, worker,
>
> happens. It just so happens that *normally* the create_worker() code
> ends up finishing setup before the new worker has actually finished
> scheduling.
>
> No?

I'm not sure. Putting aside the always-loop-with-cond-check principle
for the time being, I can't yet think of a scenario where the
interlocking would break down unless there really is a wakeup coming
from an unrelated source.
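
To spell out the interleaving that would be needed (a sketch, not
verbatim code):

  create_worker()                         new kworker task
  ---------------                         ----------------
  kthread_create_on_node(worker_thread,
                         worker, ...)
                                          <wakeup from somewhere>
                                          worker_thread()
                                            pool = worker->pool;
                                            /* not set up yet */
  ... finish setting up worker ...
  wake_up_process(worker->task);          /* intended first wakeup */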

Just experimented with commenting out that wakeup in create_worker().
Simply commenting it out doesn't break anything, but if I then wait for
PF_WQ_WORKER to be set by the new worker thread, that never happens.
ie. the initial wakeup is spurious in that there is a later wakeup
which comes when a work item is actually dispatched to the new worker,
but the newly created kworker won't start executing its callback
function without at least one extra wakeup, wherever that may be
coming from.
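
Roughly what the experiment looked like (reconstructed for
illustration, not the actual diff); the wait below never finishes:

  In create_worker():

  -       wake_up_process(worker->task);
  +       /* worker_thread() sets PF_WQ_WORKER early on, so spin until
  +        * the new worker has actually run - it never does */
  +       while (!(READ_ONCE(worker->task->flags) & PF_WQ_WORKER))
  +               cpu_relax();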

After the initial TASK_NEW handshake, the new kthread completes
kthread_create_info->done and then goes into a TASK_UNINTERRUPTIBLE
sleep, and won't start executing the callback function unless somebody
wakes it up. This being a brand new task, it's a pretty sterile wakeup
environment, so there actually has to be an outside wakeup source
here.
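
For reference, the tail of kthread() in kernel/kthread.c does roughly
the following (abridged, not verbatim):

  /* tell the creator we're spawned, then sleep until woken */
  __set_current_state(TASK_UNINTERRUPTIBLE);
  create->result = current;
  preempt_disable();
  complete(create->done);
  schedule_preempt_disabled();
  preempt_enable();

  /* only reached after somebody does wake_up_process() on us */
  ret = threadfn(data);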

If we say that everyone should expect external wakeups they have no
direct control over, then the kthread_bind() interface is broken too,
because it makes exactly the same assumption: that the task stays
dormant from creation until the owner sends a wakeup, and that its
internal state can thus be manipulated safely in between.
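
That is, the canonical sequence (a sketch; the names are
placeholders):

  struct task_struct *p;

  p = kthread_create(my_thread_fn, NULL, "my-thread/%u", cpu);
  kthread_bind(p, cpu);         /* pokes at the still-dormant task */
  wake_up_process(p);           /* only now is it supposed to run */

If a stray wakeup can land between kthread_create() and
kthread_bind(), that manipulation isn't safe either.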

Petr, regardless of how this eventually gets resolved, I'm really
curious where the wakeup that you saw came from. Would it be possible
for you to find out?

Thanks.

--
tejun