[PATCH v5] signal: break out of wait loops on kthread_stop()

From: Jason A. Donenfeld
Date: Mon Jul 11 2022 - 19:21:41 EST

I was recently surprised to learn that msleep_interruptible(),
wait_for_completion_interruptible_timeout(), and related functions
simply hung when I called kthread_stop() on kthreads using them. The
solution to fixing the case with msleep_interruptible() was more simply
to move to schedule_timeout_interruptible(). Why?

The reason is that msleep_interruptible(), and many functions just like
it, has a loop like this:

while (timeout && !signal_pending(current))
timeout = schedule_timeout_interruptible(timeout);

The call to kthread_stop() woke up the thread, so schedule_timeout_
interruptible() returned early, but because signal_pending() returned
true, it went back into another timeout, which was never woken up.

This wait loop pattern is common to various pieces of code, and I
suspect that the subtle misuse in a kthread that caused a deadlock in
the code I looked at last week is also found elsewhere.

So this commit causes signal_pending() to return true when
kthread_stop() is called, by setting TIF_NOTIFY_SIGNAL.

The same also probably applies to the similar kthread_park()
functionality, but that can be addressed later, as its semantics are
slightly different.

Cc: Eric W. Biederman <ebiederm@xxxxxxxxxxxx>
Signed-off-by: Jason A. Donenfeld <Jason@xxxxxxxxx>
Changes v4->v5:
- Use set_tsk_thread_flag instead of test_and_set_tsk_thread_flag. Must
have been a copy and paste mistarget.
Changes v3->v4:
- Don't address park() for now.
- Don't bother clearing the flag, since the task is about to be freed

kernel/kthread.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index 3c677918d8f2..7243a010f433 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -704,6 +704,7 @@ int kthread_stop(struct task_struct *k)
kthread = to_kthread(k);
set_bit(KTHREAD_SHOULD_STOP, &kthread->flags);
+ set_tsk_thread_flag(k, TIF_NOTIFY_SIGNAL);
ret = kthread->result;