Re: sched/fair: Kernel panics in pick_next_entity

From: Mike Galbraith
Date: Tue Oct 01 2024 - 04:32:09 EST


On Tue, 2024-10-01 at 00:45 +0530, Vishal Chourasia wrote:
> >
> For sanity, I ran the workload (kernel compilation) on the base commit
> where the kernel panic was initially observed. It panicked again, and a
> couple of warnings were also printed on the console, along with a
> circular locking dependency warning.
>
> Kernel 6.11.0-kp-base-10547-g684a64bf32b6 on an ppc64le
>
> ------------[ cut here ]------------
>
> ======================================================
> WARNING: possible circular locking dependency detected
> 6.11.0-kp-base-10547-g684a64bf32b6 #69 Not tainted
> ------------------------------------------------------

...

> --- interrupt: 900
> se->sched_delayed
> WARNING: CPU: 1 PID: 27867 at kernel/sched/fair.c:6062 unthrottle_cfs_rq+0x644/0x660

...that warning also spells eventual doom for the box. Here it does
anyway: running LTP's cfs_bandwidth01 testcase and hackbench together,
the box grinds to a halt in pretty short order.

With the patchlet below (submitted), I can beat on the box to my heart's
content without meeting throttle/unthrottle woes.

sched: Fix sched_delayed vs cfs_bandwidth

Meeting an unfinished DELAY_DEQUEUE treated entity in unthrottle_cfs_rq()
leads to a couple terminal scenarios. Finish it first, so ENQUEUE_WAKEUP
can proceed as it would have sans DELAY_DEQUEUE treatment.

Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue")
Reported-by: Venkat Rao Bagalkote <venkat88@xxxxxxxxxxxxxxxxxx>
Tested-by: Venkat Rao Bagalkote <venkat88@xxxxxxxxxxxxxxxxxx>
Signed-off-by: Mike Galbraith <efault@xxxxxx>
---
kernel/sched/fair.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6058,10 +6058,13 @@ void unthrottle_cfs_rq(struct cfs_rq *cf
 	for_each_sched_entity(se) {
 		struct cfs_rq *qcfs_rq = cfs_rq_of(se);

-		if (se->on_rq) {
-			SCHED_WARN_ON(se->sched_delayed);
+		/* Handle any unfinished DELAY_DEQUEUE business first. */
+		if (se->sched_delayed) {
+			int flags = DEQUEUE_SLEEP | DEQUEUE_DELAYED;
+
+			dequeue_entity(qcfs_rq, se, flags);
+		} else if (se->on_rq)
 			break;
-		}
 		enqueue_entity(qcfs_rq, se, ENQUEUE_WAKEUP);

 		if (cfs_rq_is_idle(group_cfs_rq(se)))
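
For anyone wanting to see the control flow change in isolation, here is a
rough user-space sketch of the hunk above. It is a toy model, not kernel
code: struct toy_se and every toy_* function are hypothetical stand-ins
that track only the on_rq and sched_delayed flags. It assumes a delayed
entity is still on_rq, and it ignores everything else the real
dequeue_entity() and enqueue_entity() do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sched_entity: only the two flags
 * the patch looks at, plus a parent link for the upward walk. */
struct toy_se {
	bool on_rq;		/* entity is queued */
	bool sched_delayed;	/* dequeue was deferred, entity is still on_rq */
	struct toy_se *parent;
};

/* Finish a deferred dequeue: the entity really leaves the queue. */
static void toy_dequeue_delayed(struct toy_se *se)
{
	assert(se->on_rq && se->sched_delayed);
	se->sched_delayed = false;
	se->on_rq = false;
}

/* Queue an entity that is currently not queued. */
static void toy_enqueue(struct toy_se *se)
{
	assert(!se->on_rq);
	se->on_rq = true;
}

/* Old walk: stop at the first on_rq entity. Returns how many times the
 * SCHED_WARN_ON(se->sched_delayed) condition would have been true. */
static int toy_unthrottle_old(struct toy_se *se)
{
	int warnings = 0;

	for (; se; se = se->parent) {
		if (se->on_rq) {
			if (se->sched_delayed)
				warnings++;
			break;
		}
		toy_enqueue(se);
	}
	return warnings;
}

/* Patched walk: finish any deferred dequeue first, then enqueue as if
 * the entity had never been delayed. Returns the number of enqueues. */
static int toy_unthrottle_new(struct toy_se *se)
{
	int enqueued = 0;

	for (; se; se = se->parent) {
		if (se->sched_delayed)
			toy_dequeue_delayed(se);
		else if (se->on_rq)
			break;
		toy_enqueue(se);
		enqueued++;
	}
	return enqueued;
}
```

With a three-level chain whose middle entity is on_rq and sched_delayed,
the old walk stops at that entity with the warning condition true and
leaves it delayed, while the patched walk clears the delayed state,
re-enqueues it, and stops at the next entity that is genuinely queued.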