[PATCH 5.12 244/700] kthread_worker: fix return value when kthread_mod_delayed_work() races with kthread_cancel_delayed_work_sync()

From: Greg Kroah-Hartman
Date: Mon Jul 12 2021 - 03:33:08 EST


From: Petr Mladek <pmladek@xxxxxxxx>

[ Upstream commit d71ba1649fa3c464c51ec7163e4b817345bff2c7 ]

kthread_mod_delayed_work() might race with
kthread_cancel_delayed_work_sync() or another kthread_mod_delayed_work()
call. The function lets the other operation win when it sees
work->canceling counter set. And it returns @false.

But it should return @true as it is done by the related workqueue API, see
mod_delayed_work_on().

The reason is that the return value might be used for reference counting.
It has to distinguish the case when the number of queued works has changed
or stayed the same.

The change is safe. kthread_mod_delayed_work() return value is not
checked anywhere at the moment.

Link: https://lore.kernel.org/r/20210521163526.GA17916@xxxxxxxxxx
Link: https://lkml.kernel.org/r/20210610133051.15337-4-pmladek@xxxxxxxx
Signed-off-by: Petr Mladek <pmladek@xxxxxxxx>
Reported-by: Oleg Nesterov <oleg@xxxxxxxxxx>
Cc: Nathan Chancellor <nathan@xxxxxxxxxx>
Cc: Nick Desaulniers <ndesaulniers@xxxxxxxxxx>
Cc: Tejun Heo <tj@xxxxxxxxxx>
Cc: Minchan Kim <minchan@xxxxxxxxxx>
Cc: <jenhaochen@xxxxxxxxxx>
Cc: Martin Liu <liumartin@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
---
kernel/kthread.c | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index e8da89c714f5..44f89a602b00 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -1161,14 +1161,14 @@ static bool __kthread_cancel_work(struct kthread_work *work)
* modify @dwork's timer so that it expires after @delay. If @delay is zero,
* @work is guaranteed to be queued immediately.
*
- * Return: %true if @dwork was pending and its timer was modified,
- * %false otherwise.
+ * Return: %false if @dwork was idle and queued, %true otherwise.
*
* A special case is when the work is being canceled in parallel.
* It might be caused either by the real kthread_cancel_delayed_work_sync()
* or yet another kthread_mod_delayed_work() call. We let the other command
- * win and return %false here. The caller is supposed to synchronize these
- * operations a reasonable way.
+ * win and return %true here. The return value can be used for reference
+ * counting and the number of queued works stays the same. Anyway, the caller
+ * is supposed to synchronize these operations a reasonable way.
*
* This function is safe to call from any context including IRQ handler.
* See __kthread_cancel_work() and kthread_delayed_work_timer_fn()
@@ -1180,13 +1180,15 @@ bool kthread_mod_delayed_work(struct kthread_worker *worker,
{
struct kthread_work *work = &dwork->work;
unsigned long flags;
- int ret = false;
+ int ret;

raw_spin_lock_irqsave(&worker->lock, flags);

/* Do not bother with canceling when never queued. */
- if (!work->worker)
+ if (!work->worker) {
+ ret = false;
goto fast_queue;
+ }

/* Work must not be used with >1 worker, see kthread_queue_work() */
WARN_ON_ONCE(work->worker != worker);
@@ -1204,8 +1206,11 @@ bool kthread_mod_delayed_work(struct kthread_worker *worker,
* be used for reference counting.
*/
kthread_cancel_delayed_work_timer(work, &flags);
- if (work->canceling)
+ if (work->canceling) {
+ /* The number of works in the queue does not change. */
+ ret = true;
goto out;
+ }
ret = __kthread_cancel_work(work);

fast_queue:
--
2.30.2