On Thu, Oct 10, 2024 at 03:06:21PM +0200, Peter Zijlstra wrote:
> On Thu, Oct 10, 2024 at 09:03:16AM -0400, Johannes Weiner wrote:
> > I'll try to come up with a suitable solution as well, please don't
> > apply this one for now.
>
> I'll make sure it doesn't end up in tip as-is.

Thanks.

This would be a replacement patch for #2 and #3 that handles migration
of delayed tasks. It's slightly more invasive on the psi callback
side, but I think it keeps the sched core bits simpler. Thoughts?
---
From d72a665d7c7c7d9c806424f473d13452754471d3 Mon Sep 17 00:00:00 2001
From: Johannes Weiner <hannes@xxxxxxxxxxx>
Date: Thu, 10 Oct 2024 14:37:43 -0400
Subject: [PATCH] sched: psi: handle delayed-dequeue task migration

Since sched_delayed tasks remain queued even after blocking, the load
balancer can migrate them between runqueues while PSI considers them
to be asleep. As a result, PSI misreads the migration requeue followed
by a wakeup as a double queue:

  psi: inconsistent task state! task=... cpu=... psi_flags=4 clear=. set=4

First, call psi_enqueue() after p->sched_class->enqueue_task(). A
wakeup will clear p->se.sched_delayed while a migration will not, so
psi can use that flag to tell them apart.

Then teach psi to migrate any "sleep" state when delayed-dequeue tasks
are being migrated.

Delayed-dequeue tasks can be revived by ttwu_runnable(), which will
call down with a new ENQUEUE_DELAYED. Rather than further complicating
the wakeup conditional in enqueue_task(), identify migration contexts
and default to wakeup handling for all other cases.

Debugged-by-and-original-fix-by: K Prateek Nayak <kprateek.nayak@xxxxxxx>
Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue")
Closes: https://lore.kernel.org/lkml/20240830123458.3557-1-spasswolf@xxxxxx/
Closes: https://lore.kernel.org/all/cd67fbcd-d659-4822-bb90-7e8fbb40a856@xxxxxxxxxxxxx/
Link: https://lore.kernel.org/lkml/f82def74-a64a-4a05-c8d4-4eeb3e03d0c0@xxxxxxx/
Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>
---
kernel/sched/core.c | 12 +++++------
kernel/sched/stats.h | 48 ++++++++++++++++++++++++++++++--------------
2 files changed, 39 insertions(+), 21 deletions(-)
[..snip..]
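
For readers following along without the diff, here is a rough userspace
model of the classification described in the changelog. It is only a
sketch, not the code from the snipped hunks: every model_* name, the
ST_* values, and the delayed_aware switch are made up for illustration,
and the state tracking is far simpler than what psi really does. It
shows how checking sched_delayed after the class enqueue separates a
migration of a "sleeping" task from a wakeup, and how treating that
migration as a runnable requeue leads to a double set on the later
wakeup.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative state bits for this model only (hypothetical values). */
enum { ST_IOWAIT = 1, ST_RUNNING = 4 };

/* Hypothetical, heavily simplified stand-in for a task. */
struct model_task {
	bool queued;        /* still on a runqueue */
	bool sched_delayed; /* blocked, but the dequeue was deferred */
	bool in_iowait;     /* blocked on I/O */
	unsigned psi_flags; /* states the model currently accounts for */
};

/* Returns false on a double clear or double set ("inconsistent state"). */
static bool model_psi_change(struct model_task *t, unsigned clear, unsigned set)
{
	bool ok = (t->psi_flags & clear) == clear && !(t->psi_flags & set);

	t->psi_flags = (t->psi_flags & ~clear) | set;
	return ok;
}

/*
 * Enqueue hook, assumed to run AFTER the class enqueue, so that
 * sched_delayed has already been cleared if this is a wakeup.
 * delayed_aware=false models treating every migration as runnable.
 */
static bool model_psi_enqueue(struct model_task *t, bool migrated,
			      bool delayed_aware)
{
	unsigned clear = 0, set = 0;

	if (delayed_aware && t->sched_delayed) {
		/* Migration of a "sleeping" task: carry the sleep state. */
		if (t->in_iowait)
			set |= ST_IOWAIT;
	} else if (migrated) {
		/* Migration of a runnable task. */
		set = ST_RUNNING;
	} else {
		/* Default: wakeup of a sleeping task. */
		if (t->in_iowait)
			clear |= ST_IOWAIT;
		set = ST_RUNNING;
	}
	return model_psi_change(t, clear, set);
}

/* Load-balancer move: drop state on the old CPU, requeue on the new. */
static bool model_migrate(struct model_task *t, bool delayed_aware)
{
	bool ok;

	assert(t->queued); /* only queued tasks are migrated in this model */
	ok = model_psi_change(t, t->psi_flags, 0);
	return model_psi_enqueue(t, true, delayed_aware) && ok;
}

/* Wakeup of a delayed task: the class enqueue clears sched_delayed first. */
static bool model_wakeup(struct model_task *t, bool delayed_aware)
{
	t->sched_delayed = false;
	return model_psi_enqueue(t, false, delayed_aware);
}
```

In this model, a delayed task blocked on I/O keeps ST_IOWAIT across
model_migrate() when delayed_aware is true, and the following
model_wakeup() switches it to ST_RUNNING cleanly. With delayed_aware
false, the migration already sets ST_RUNNING, so the wakeup reports an
inconsistency, which is the shape of the warning quoted in the
changelog.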