[PATCH AUTOSEL 5.4 231/330] workqueue: Remove the warning in wq_worker_sleeping()

From: Sasha Levin
Date: Thu Sep 17 2020 - 22:06:06 EST


From: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>

[ Upstream commit 62849a9612924a655c67cf6962920544aa5c20db ]

The kernel test robot triggered a warning with the following race:
task-ctx A interrupt-ctx B
worker
-> process_one_work()
-> work_item()
-> schedule();
-> sched_submit_work()
-> wq_worker_sleeping()
-> ->sleeping = 1
atomic_dec_and_test(nr_running)
__schedule(); *interrupt*
async_page_fault()
-> local_irq_enable();
-> schedule();
-> sched_submit_work()
-> wq_worker_sleeping()
-> if (WARN_ON(->sleeping)) return
-> __schedule()
-> sched_update_worker()
-> wq_worker_running()
-> atomic_inc(nr_running);
-> ->sleeping = 0;

-> sched_update_worker()
-> wq_worker_running()
if (!->sleeping) return

In this context the warning is pointless everything is fine.
An interrupt before wq_worker_sleeping() will perform the ->sleeping
assignment (0 -> 1 > 0) twice.
An interrupt after wq_worker_sleeping() will trigger the warning and
nr_running will be decremented (by A) and incremented once (only by B, A
will skip it). This is the case until the ->sleeping is zeroed again in
wq_worker_running().

Remove the WARN statement because this condition may happen. Document
that preemption around wq_worker_sleeping() needs to be disabled to
protect ->sleeping and not just as an optimisation.

Fixes: 6d25be5782e48 ("sched/core, workqueues: Distangle worker accounting from rq lock")
Reported-by: kernel test robot <lkp@xxxxxxxxx>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Tejun Heo <tj@xxxxxxxxxx>
Link: https://lkml.kernel.org/r/20200327074308.GY11705@shao2-debian
Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
---
kernel/sched/core.c | 3 ++-
kernel/workqueue.c | 6 ++++--
2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 352239c411a44..79ce22de44095 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4199,7 +4199,8 @@ static inline void sched_submit_work(struct task_struct *tsk)
* it wants to wake up a task to maintain concurrency.
* As this function is called inside the schedule() context,
* we disable preemption to avoid it calling schedule() again
- * in the possible wakeup of a kworker.
+ * in the possible wakeup of a kworker and because wq_worker_sleeping()
+ * requires it.
*/
if (tsk->flags & PF_WQ_WORKER) {
preempt_disable();
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 1a0c224af6fb3..4aa268582a225 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -864,7 +864,8 @@ void wq_worker_running(struct task_struct *task)
* @task: task going to sleep
*
* This function is called from schedule() when a busy worker is
- * going to sleep.
+ * going to sleep. Preemption needs to be disabled to protect ->sleeping
+ * assignment.
*/
void wq_worker_sleeping(struct task_struct *task)
{
@@ -881,7 +882,8 @@ void wq_worker_sleeping(struct task_struct *task)

pool = worker->pool;

- if (WARN_ON_ONCE(worker->sleeping))
+ /* Return if preempted before wq_worker_running() was reached */
+ if (worker->sleeping)
return;

worker->sleeping = 1;
--
2.25.1