Re: [PATCH] kthread: Unpark only parked kthread
From: Andrew Morton
Date: Wed Oct 02 2024 - 17:08:35 EST
On Wed, 2 Oct 2024 16:15:05 +0200 Frederic Weisbecker <frederic@xxxxxxxxxx> wrote:
> On Thu, Sep 26, 2024 at 01:21:30PM -0700, Andrew Morton wrote:
> > On Fri, 13 Sep 2024 23:46:34 +0200 Frederic Weisbecker <frederic@xxxxxxxxxx> wrote:
> >
> > > Calling into kthread unparking unconditionally is mostly harmless when
> > > the kthread is already unparked. The wake up is then simply ignored
> > > because the target is not in TASK_PARKED state.
> > >
> > > However if the kthread is per CPU, the wake up is preceded by a call
> > > to kthread_bind() which expects the task to be inactive and in
> > > TASK_PARKED state, which obviously isn't the case if it is unparked.
> > >
> > > As a result, calling kthread_stop() on an unparked per-cpu kthread
> > > triggers such a warning:
> > >
> > > WARNING: CPU: 0 PID: 11 at kernel/kthread.c:525 __kthread_bind_mask kernel/kthread.c:525
> > > <TASK>
> > > kthread_stop+0x17a/0x630 kernel/kthread.c:707
> > > destroy_workqueue+0x136/0xc40 kernel/workqueue.c:5810
> > > wg_destruct+0x1e2/0x2e0 drivers/net/wireguard/device.c:257
> > > netdev_run_todo+0xe1a/0x1000 net/core/dev.c:10693
> > > default_device_exit_batch+0xa14/0xa90 net/core/dev.c:11769
> > > ops_exit_list net/core/net_namespace.c:178 [inline]
> > > cleanup_net+0x89d/0xcc0 net/core/net_namespace.c:640
> > > process_one_work kernel/workqueue.c:3231 [inline]
> > > process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3312
> > > worker_thread+0x86d/0xd70 kernel/workqueue.c:3393
> > > kthread+0x2f0/0x390 kernel/kthread.c:389
> > > ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
> > > ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
> > > </TASK>
> > >
> > > Fix this by skipping unnecessary unparking while stopping a kthread.
> >
> > How does userspace trigger this? Is it an issue in current mainline?
>
> I guess it takes some module unload that destroys a workqueue to
> trigger this. And it's an issue in current mainline.
Cool.
> >
> > Should we backport the fix into -stable kernels? (That depends on the
> > answers to the above questions.)
> >
> > It looks like the issue is old, so a Fixes: probably isn't needed. But
> > as the issue is old, why did it come to light now?
>
> It's hard to tell. The core of the issue has been there for a long while,
> but the conditions for it to really happen in practice have probably
> existed since:
>
> 5c25b5ff89f0 (workqueue: Tag bound workers with KTHREAD_IS_PER_CPU)
>
> So it might deserve a Fixes: actually.
OK, thanks. I added
Fixes: 5c25b5ff89f0 ("workqueue: Tag bound workers with KTHREAD_IS_PER_CPU")
Cc: <stable@xxxxxxxxxxxxxxx>
and it's queued for a 6.12-rcX merge.
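
For readers following along, the failure mode described in the changelog can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the kernel code or the patch itself: every name below (model_kthread, model_bind, model_unpark, and so on) is hypothetical, the per_cpu field merely stands in for KTHREAD_IS_PER_CPU, and the warning counter stands in for the WARNING in __kthread_bind_mask.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct model_kthread {
	bool per_cpu;      /* stands in for KTHREAD_IS_PER_CPU */
	bool parked;       /* stands in for the task being in TASK_PARKED */
	bool should_stop;  /* set when a stop has been requested */
	int bind_warnings; /* counts modelled warnings from the bind step */
};

/* The bind step expects an inactive, parked task and warns otherwise. */
static void model_bind(struct model_kthread *k)
{
	if (!k->parked)
		k->bind_warnings++;
}

/* Unpark: a per-CPU thread is rebound before the wake up. */
static void model_unpark(struct model_kthread *k)
{
	if (k->per_cpu)
		model_bind(k);
	k->parked = false;
}

/* Stop path that unparks unconditionally (the behaviour being fixed). */
static void model_stop_unconditional(struct model_kthread *k)
{
	k->should_stop = true;
	model_unpark(k);
}

/* Stop path that unparks only a thread that is actually parked. */
static void model_stop_guarded(struct model_kthread *k)
{
	k->should_stop = true;
	if (k->parked)
		model_unpark(k);
}

/* Run one scenario and return how many bind warnings it produced. */
static int model_warnings_for(bool per_cpu, bool parked, bool guarded)
{
	struct model_kthread k = { .per_cpu = per_cpu, .parked = parked };

	if (guarded)
		model_stop_guarded(&k);
	else
		model_stop_unconditional(&k);
	return k.bind_warnings;
}

/* Small self-check covering the cases discussed in the changelog. */
static void model_self_check(void)
{
	/* Unparked per-CPU thread, unconditional unpark: warning. */
	assert(model_warnings_for(true, false, false) == 1);
	/* Same thread with the guard: no warning. */
	assert(model_warnings_for(true, false, true) == 0);
	/* Unparked non-per-CPU thread: harmless either way. */
	assert(model_warnings_for(false, false, false) == 0);
}
```

In this model only the combination "per-CPU and not parked" reaches the bind step in the wrong state, which matches the changelog's point that the unconditional unpark is mostly harmless except for per-CPU kthreads.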