[PATCH 2/3] sched/core: Add ENQUEUE_WAKEUP flag alongside ENQUEUE_DELAYED

From: K Prateek Nayak
Date: Thu Oct 10 2024 - 04:47:14 EST


With the fix for dequeuing PSI signals of delayed tasks applied, a new
"inconsistent task state" PSI splat was observed during boot, similar to:

psi: inconsistent task state! task=... cpu=... psi_flags=5 clear=4 set=1

Tracking the PSI changes along with the task's state revealed the
following series of events:

psi_task_switch: psi_flags=4 clear=4 set=1 # sched_delayed is set to 1
psi_enqueue: psi_flags=1 clear=0 set=4 # requeue of delayed entity via ENQUEUE_DELAYED
psi_task_switch: psi_flags=5 clear=4 set=1 # task is blocked again but 1 is already set
psi: inconsistent task state! task=... cpu=... psi_flags=5 clear=4 set=1
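
The sequence above can be replayed with a small user-space model. This is
only a sketch and not kernel code: struct psi_model, psi_model_change() and
replay_trace() are hypothetical names, and the bit values are assumptions
chosen to match the trace (1 for TSK_IOWAIT, 4 for TSK_RUNNING). The
consistency rule modeled here is that a bit being set must not already be
set, and a bit being cleared must currently be set.

```c
#include <stdbool.h>

/* Assumed bit values, chosen to match the trace above. */
#define TSK_IOWAIT  (1u << 0)	/* "1" in the trace */
#define TSK_RUNNING (1u << 2)	/* "4" in the trace */

/* Hypothetical stand-in for the per-task PSI state. */
struct psi_model {
	unsigned int psi_flags;
};

/*
 * Apply a clear/set pair. Returns false when the transition is
 * inconsistent: a bit to set is already set, or a bit to clear is
 * not currently set.
 */
static bool psi_model_change(struct psi_model *t, unsigned int clear,
			     unsigned int set)
{
	bool ok = !(t->psi_flags & set) && (t->psi_flags & clear) == clear;

	t->psi_flags &= ~clear;
	t->psi_flags |= set;
	return ok;
}

/* Replay the three transitions; returns the number of inconsistent ones. */
static int replay_trace(bool clear_iowait_on_requeue)
{
	struct psi_model t = { .psi_flags = TSK_RUNNING };
	unsigned int requeue_clear = clear_iowait_on_requeue ? TSK_IOWAIT : 0;
	int bad = 0;

	/* psi_task_switch: task blocks on I/O, dequeue is delayed */
	if (!psi_model_change(&t, TSK_RUNNING, TSK_IOWAIT))
		bad++;
	/* psi_enqueue: requeue of the delayed entity */
	if (!psi_model_change(&t, requeue_clear, TSK_RUNNING))
		bad++;
	/* psi_task_switch: task blocks on I/O again */
	if (!psi_model_change(&t, TSK_RUNNING, TSK_IOWAIT))
		bad++;

	return bad;
}
```

In this model, replay_trace(false) reports one inconsistent transition (the
last one, matching the splat), while replay_trace(true), which clears the
IOWAIT bit on requeue, reports none.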

The TSK_IOWAIT flag was never cleared on requeue since psi_enqueue() only
clears it on a "wakeup" which, in terms of enqueue flags, is defined as:

(flags & ENQUEUE_WAKEUP) && !(flags & ENQUEUE_MIGRATED)

Add ENQUEUE_WAKEUP alongside ENQUEUE_DELAYED for requeue through
ttwu_runnable(). psi_enqueue() is the only observer of this flag in the
requeue path and it pairs with the DEQUEUE_SLEEP in block_task().
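
As a rough illustration of why the extra flag matters, the wakeup condition
quoted above can be evaluated against the old and new flag combinations. The
helper is_psi_wakeup() is a hypothetical name, and the numeric flag values
below are illustrative assumptions; the authoritative definitions live in
kernel/sched/sched.h.

```c
#include <stdbool.h>

/* Illustrative values only; see kernel/sched/sched.h for the real ones. */
#define ENQUEUE_WAKEUP   0x01
#define ENQUEUE_NOCLOCK  0x08
#define ENQUEUE_MIGRATED 0x40
#define ENQUEUE_DELAYED  0x200

/* Hypothetical helper mirroring the "wakeup" condition from the text. */
static bool is_psi_wakeup(int flags)
{
	return (flags & ENQUEUE_WAKEUP) && !(flags & ENQUEUE_MIGRATED);
}
```

With these definitions, is_psi_wakeup(ENQUEUE_NOCLOCK | ENQUEUE_DELAYED) is
false, so the requeue is not treated as a wakeup, whereas
is_psi_wakeup(ENQUEUE_NOCLOCK | ENQUEUE_WAKEUP | ENQUEUE_DELAYED) is true.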

Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue")
Closes: https://lore.kernel.org/lkml/20240830123458.3557-1-spasswolf@xxxxxx/
Closes: https://lore.kernel.org/all/cd67fbcd-d659-4822-bb90-7e8fbb40a856@xxxxxxxxxxxxx/
Link: https://lore.kernel.org/lkml/f82def74-a64a-4a05-c8d4-4eeb3e03d0c0@xxxxxxx/
Tested-by: Johannes Weiner <hannes@xxxxxxxxxxx>
Signed-off-by: K Prateek Nayak <kprateek.nayak@xxxxxxx>
---
kernel/sched/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 88cbfc671fb6..52be38021ebb 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3733,7 +3733,7 @@ static int ttwu_runnable(struct task_struct *p, int wake_flags)
 	if (task_on_rq_queued(p)) {
 		update_rq_clock(rq);
 		if (p->se.sched_delayed)
-			enqueue_task(rq, p, ENQUEUE_NOCLOCK | ENQUEUE_DELAYED);
+			enqueue_task(rq, p, ENQUEUE_NOCLOCK | ENQUEUE_WAKEUP | ENQUEUE_DELAYED);
 		if (!task_on_cpu(rq, p)) {
 			/*
 			 * When on_rq && !on_cpu the task is preempted, see if
--
2.34.1