Re: sched/fair: Kernel panics in pick_next_entity

From: Mike Galbraith
Date: Wed Oct 02 2024 - 02:41:56 EST


On Tue, 2024-10-01 at 18:41 +0200, Mike Galbraith wrote:
>
> When I hit $subject, LTP's cfs_bandwidth01 was running, but there was no
> warning prelude, box went straight to panic.  Trying to reproduce using
> that testcase plus hackbench as efficacy booster produced lots of dying
> box noise, but zero sneaky $subject instances before or after quash.

Hohum, this morning I did hit..

1. WARNING: CPU: 5 PID: 931 at kernel/sched/fair.c:6062 unthrottle_cfs_rq+0x4c3/0x4d0
2. WARNING: CPU: 0 PID: 786 at kernel/sched/fair.c:704 update_entity_lag+0x79/0x90
3. NULL dereference in pick_next_entity()

..instead of brick, workqueue stall etc. Twice. Not that it matters.
I was only mucking about with it because I was curious whether telling
LB to stop moving sched_delayed tasks about would matter. (nope)

-Mike