Re: [PATCH v2 10/10] sched/eevdf: Move to a single runqueue
From: John Stultz
Date: Wed May 13 2026 - 01:00:47 EST
On Tue, May 12, 2026 at 9:51 PM John Stultz <jstultz@xxxxxxxxxx> wrote:
>
> On Mon, May 11, 2026 at 5:07 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > Change fair/cgroup to a single runqueue.
> >
...
>
> I know Vincent was having some perf troubles with this patch, but
> booting on a 64 vCPU qemu environment, I'm seeing:
>
> [ 5.688490] Oops: divide error: 0000 [#1] SMP NOPTI
> [ 5.689457] CPU: 47 UID: 0 PID: 0 Comm: swapper/47 Not tainted
> 7.1.0-rc2-00026-g82a8ec6fb3f9 #38 PREEMPT(full)
> [ 5.689457] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS 1.17.0-debian-1.17.0-1 04/01/2014
> [ 5.689457] RIP: 0010:wakeup_preempt_fair+0x1b7/0x430
> [ 5.689457] Code: 74 0b 48 8b 52 28 48 39 d0 48 0f 47 c2 48 8b b9
> 90 00 00 00 48 8b b1 08 01 00 00 48 81 ff 00 00 10 00 74 09 48 c1 e0
> 14 31 9
> [ 5.689457] RSP: 0000:ffffc9000021fd70 EFLAGS: 00010046
> [ 5.689457] RAX: 000002ab98000000 RBX: ffff8881b8e2db40 RCX: ffffffff83022a80
> [ 5.689457] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> [ 5.689457] RBP: 0000000000000001 R08: ffff88810cb14380 R09: ffffffff83022b00
> [ 5.689457] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
> [ 5.689457] R13: 0000000000000000 R14: ffff88810cb14300 R15: ffff8881b8e2da00
> [ 5.689457] FS: 0000000000000000(0000) GS:ffff888235c2e000(0000)
> knlGS:0000000000000000
> [ 5.689457] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 5.689457] CR2: 0000000000000000 CR3: 000000000304c001 CR4: 0000000000370ef0
> [ 5.689457] Call Trace:
> [ 5.689457] <TASK>
> [ 5.689457] wakeup_preempt+0xa8/0xd0
> [ 5.689457] attach_one_task+0xec/0x150
> [ 5.689457] __schedule+0x1ad8/0x21c0
> [ 5.689457] schedule_idle+0x22/0x40
> [ 5.689457] cpu_startup_entry+0x29/0x30
> [ 5.689457] start_secondary+0xf7/0x100
> [ 5.689457] common_startup_64+0x13e/0x148
> [ 5.689457] </TASK>
> [ 5.689457] Dumping ftrace buffer:
> [ 5.689457] (ftrace buffer empty)
> [ 5.689457] ---[ end trace 0000000000000000 ]---
> [ 5.689457] RIP: 0010:wakeup_preempt_fair+0x1b7/0x430
> [ 5.689457] Code: 74 0b 48 8b 52 28 48 39 d0 48 0f 47 c2 48 8b b9
> 90 00 00 00 48 8b b1 08 01 00 00 48 81 ff 00 00 10 00 74 09 48 c1 e0
> 14 31 9
> [ 5.689457] RSP: 0000:ffffc9000021fd70 EFLAGS: 00010046
> [ 5.689457] RAX: 000002ab98000000 RBX: ffff8881b8e2db40 RCX: ffffffff83022a80
> [ 5.689457] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> [ 5.689457] RBP: 0000000000000001 R08: ffff88810cb14380 R09: ffffffff83022b00
> [ 5.689457] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
> [ 5.689457] R13: 0000000000000000 R14: ffff88810cb14300 R15: ffff8881b8e2da00
> [ 5.689457] FS: 0000000000000000(0000) GS:ffff888235c2e000(0000)
> knlGS:0000000000000000
> [ 5.689457] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 5.689457] CR2: 0000000000000000 CR3: 000000000304c001 CR4: 0000000000370ef0
> [ 5.689457] Kernel panic - not syncing: Fatal exception
>
> Which I bisected down to this last patch in the series.
>
> faddr2line gave me:
> __calc_delta at kernel/sched/fair.c:290
> (inlined by) calc_delta_fair at kernel/sched/fair.c:300
> (inlined by) update_protect_slice at kernel/sched/fair.c:1070
> (inlined by) wakeup_preempt_fair at kernel/sched/fair.c:9193
>
> This usually trips as the ww_mutex selftest starts at bootup.
>
> Unfortunately I still see it with the add-on changes you proposed in
> response to K Prateek's feedback here.
>
> I'll try to narrow it down further tomorrow.
As karma would have it, this does seem to depend on CONFIG_SCHED_PROXY_EXEC. :)
I'm guessing the switch in calc_delta_fair() to use se->h_load is
uncovering something proxy-exec isn't handling properly with that value.
But I'll have more tomorrow.
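
To illustrate the failure mode I suspect (not the actual fair.c code):
a divide error in the calc_delta path would fit a zero load weight
ending up as the divisor. Below is a minimal userspace sketch of that
idea. Every name in it (sketch_calc_delta_fair, SKETCH_NICE_0_LOAD) is
hypothetical, and the assumption that the weight was zero is an
inference, not something confirmed yet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a nice-0 load weight of 1 << 20. */
#define SKETCH_NICE_0_LOAD (UINT64_C(1) << 20)

/*
 * Scale delta by SKETCH_NICE_0_LOAD / weight.
 *
 * Returns false and leaves *out untouched when weight is zero, instead
 * of executing a division that would trap. Note that the multiply can
 * overflow for very large delta; this sketch does not handle that.
 */
static bool sketch_calc_delta_fair(uint64_t delta, uint64_t weight,
				   uint64_t *out)
{
	if (weight == 0)
		return false;

	/* Fast path: a nice-0 weight leaves delta unchanged. */
	if (weight == SKETCH_NICE_0_LOAD) {
		*out = delta;
		return true;
	}

	*out = (delta * SKETCH_NICE_0_LOAD) / weight;
	return true;
}
```

A caller would check the return value and decide what a zero weight
should mean. Whether a guard like this is the right fix, or whether the
value should never be zero in the first place, is still open.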
thanks
-john