[PATCH] kthread: Unpark only parked kthreads (was Re: [syzbot] [wireguard?] WARNING in kthread_unpark (2))

From: Frederic Weisbecker
Date: Wed Sep 11 2024 - 08:04:41 EST


On Wed, Jul 31, 2024 at 04:29:02AM -0700, syzbot wrote:
> Hello,
>
> syzbot has tested the proposed patch and the reproducer did not trigger any issue:
>
> Reported-by: syzbot+943d34fa3cf2191e3068@xxxxxxxxxxxxxxxxxxxxxxxxx
> Tested-by: syzbot+943d34fa3cf2191e3068@xxxxxxxxxxxxxxxxxxxxxxxxx
>
> Tested on:
>
> commit: dc1c8034 minmax: simplify min()/max()/clamp() implemen..
> git tree: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> console output: https://syzkaller.appspot.com/x/log.txt?x=1264b511980000
> kernel config: https://syzkaller.appspot.com/x/.config?x=2258b49cd9b339fa
> dashboard link: https://syzkaller.appspot.com/bug?extid=943d34fa3cf2191e3068
> compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> patch: https://syzkaller.appspot.com/x/patch.diff?x=10fe9911980000
>
> Note: testing is done by a robot and is best-effort only.
>

The problem is in the kthread code. kthread_stop() seems to assume that
the target is parked, and since kthread_stop() is seldom called on per-cpu
kthreads (smpboot_unregister_percpu_thread() doesn't have any user yet), this
went unnoticed until workqueue happened to do it.

Can you test the following?
---
From: Frederic Weisbecker <frederic@xxxxxxxxxx>
Date: Tue, 10 Sep 2024 22:10:19 +0200
Subject: [PATCH] kthread: Unpark only parked kthreads

Calling into kthread unparking unconditionally is mostly harmless when
the kthread is already unparked. The wake up is then simply ignored
because the target is not in TASK_PARKED state.

However if the kthread is per CPU, the wake up is preceded by a call
to kthread_bind() which expects the task to be inactive and in
TASK_PARKED state, which obviously isn't the case if it is unparked.

As a result, calling kthread_stop() on an unparked per-cpu kthread
triggers a warning such as:

WARNING: CPU: 0 PID: 11 at kernel/kthread.c:525 __kthread_bind_mask kernel/kthread.c:525
<TASK>
kthread_stop+0x17a/0x630 kernel/kthread.c:707
destroy_workqueue+0x136/0xc40 kernel/workqueue.c:5810
wg_destruct+0x1e2/0x2e0 drivers/net/wireguard/device.c:257
netdev_run_todo+0xe1a/0x1000 net/core/dev.c:10693
default_device_exit_batch+0xa14/0xa90 net/core/dev.c:11769
ops_exit_list net/core/net_namespace.c:178 [inline]
cleanup_net+0x89d/0xcc0 net/core/net_namespace.c:640
process_one_work kernel/workqueue.c:3231 [inline]
process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3312
worker_thread+0x86d/0xd70 kernel/workqueue.c:3393
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>

Fix this by skipping unnecessary unparking while stopping a kthread.

Reported-by: syzbot+943d34fa3cf2191e3068@xxxxxxxxxxxxxxxxxxxxxxxxx
Suggested-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
---
kernel/kthread.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index f7be976ff88a..5e2ba556aba8 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -623,6 +623,8 @@ void kthread_unpark(struct task_struct *k)
 {
 	struct kthread *kthread = to_kthread(k);
 
+	if (!test_bit(KTHREAD_SHOULD_PARK, &kthread->flags))
+		return;
 	/*
 	 * Newly created kthread was parked when the CPU was offline.
 	 * The binding was lost and we need to set it again.
--
2.46.0
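
For illustration only, the failure mode described in the changelog can be
approximated with a small userspace model. This is a rough sketch, not kernel
code: every name prefixed with model_ (and warnings_when_stopping()) is
hypothetical, the two booleans merely stand in for KTHREAD_IS_PER_CPU and
KTHREAD_SHOULD_PARK, and the "warning" is just a counter standing in for the
check that the bind path performs on the target's state. It assumes the unpark
path binds per-cpu threads first and then wakes the target, as the changelog
describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel definitions. */
enum model_state { MODEL_RUNNING, MODEL_PARKED };

struct model_kthread {
	bool is_per_cpu;        /* stands in for KTHREAD_IS_PER_CPU */
	bool should_park;       /* stands in for KTHREAD_SHOULD_PARK */
	enum model_state state; /* stands in for the task state */
	int warnings;           /* how many times the bind check complained */
};

/* Models the bind step: it expects the target to be parked. */
static void model_bind(struct model_kthread *k)
{
	if (k->state != MODEL_PARKED)
		k->warnings++;
}

/*
 * Models the unpark path. With guarded == true, the early return from the
 * patch is applied; with guarded == false, the unconditional behaviour is used.
 */
static void model_unpark(struct model_kthread *k, bool guarded)
{
	assert(k != NULL);

	if (guarded && !k->should_park)
		return;

	if (k->is_per_cpu)
		model_bind(k);

	k->should_park = false;

	/* The wake up only has an effect on a parked target. */
	if (k->state == MODEL_PARKED)
		k->state = MODEL_RUNNING;
}

/* Stops a per-cpu thread (modelled as one unpark) and reports warnings. */
static int warnings_when_stopping(bool parked, bool guarded)
{
	struct model_kthread k = {
		.is_per_cpu = true,
		.should_park = parked,
		.state = parked ? MODEL_PARKED : MODEL_RUNNING,
		.warnings = 0,
	};

	model_unpark(&k, guarded);
	return k.warnings;
}
```

In this model, only the combination of an unparked per-cpu thread and an
unguarded unpark increments the warning counter, which mirrors the case the
patch is meant to skip. A parked thread is unparked normally in both variants.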