Re: [PATCH v2 10/10] sched/eevdf: Move to a single runqueue
From: Zhang Qiao
Date: Tue May 26 2026 - 03:54:55 EST
Hi Peter,
On 2026/5/11 19:31, Peter Zijlstra wrote:
> @@ -13729,14 +13616,20 @@ static inline void task_tick_core(struct
> */
> static void task_tick_fair(struct rq *rq, struct task_struct *curr, int queued)
> {
> - struct cfs_rq *cfs_rq;
> struct sched_entity *se = &curr->se;
> + unsigned long weight = NICE_0_LOAD;
> + struct cfs_rq *cfs_rq;
>
> for_each_sched_entity(se) {
> cfs_rq = cfs_rq_of(se);
> entity_tick(cfs_rq, se, queued);
> +
> + weight = __calc_prop_weight(cfs_rq, se, weight);
While testing the sched/flat branch on an AMD EPYC 9654 (384 CPUs,
8 NUMA nodes) with a 2-level cgroup hierarchy and a cfs_bandwidth quota
enabled, I hit a divide-by-zero oops under hackbench:
[ 142.308571] divide error: 0000 [#1] SMP NOPTI
[ 142.308582] RIP: 0010:task_tick_fair+0x19e/0x410
[ 142.308601] Call Trace:
[ 142.308604] <IRQ>
[ 142.308607] scheduler_tick+0x6a/0x110
[ 142.308609] update_process_times+0x6b/0x90
[ 142.308611] tick_sched_handle+0x2a/0x70
[ 142.308613] tick_sched_timer+0x57/0xb0
faddr2line confirms:
task_tick_fair+0x19e/0x410:
__calc_prop_weight at kernel/sched/fair.c:4085
(inlined by) task_tick_fair at kernel/sched/fair.c:13576
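
To illustrate the suspected failure mode, here is a minimal userspace
sketch. It is not the kernel code: it assumes __calc_prop_weight() scales
the incoming weight by the entity's share of its cfs_rq load, and that the
cfs_rq load can be observed as zero at tick time (for example after a
throttle has dequeued everything). The toy_* names and the "return weight
unchanged" fallback are hypothetical.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures; only the load
 * weights matter for this illustration. */
struct toy_cfs_rq {
	unsigned long load_weight;	/* sum of queued entity weights */
};

struct toy_se {
	unsigned long load_weight;	/* weight of this entity */
};

/*
 * Assumed shape of the proportional-weight step:
 *
 *	weight' = weight * se_weight / cfs_rq_weight
 *
 * Without the zero check, the division traps on x86 with a
 * "divide error" when cfs_rq_weight == 0. The fallback below
 * (leave the weight unchanged) is only one possible policy.
 */
static unsigned long
toy_calc_prop_weight(const struct toy_cfs_rq *cfs_rq,
		     const struct toy_se *se, unsigned long weight)
{
	unsigned long total = cfs_rq->load_weight;

	if (!total)
		return weight;	/* nothing queued: skip the scaling */

	/* Widen to 64 bits so the product cannot overflow on 32-bit. */
	return (unsigned long)(((uint64_t)weight * se->load_weight) / total);
}
```

Whether a guard in the helper or tighter ordering against the throttle
path is the right fix is a separate question; the sketch only shows where
a zero denominator would bite.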
===========================================================
Reproduction
===========================================================
Kernel: sched/flat branch (54d493980e00 and later)
Hardware: AMD EPYC 9654, 2 sockets, 384 logical CPUs
# 2-level cgroup, quota = 50% of one period
cgcreate -g cpu:/bw/l1/l2
cgset -r cpu.cfs_quota_us=50000 /bw/l1/l2
cgset -r cpu.cfs_period_us=100000 /bw/l1/l2
# high task count amplifies the throttle→tick race window
cgexec -g cpu:/bw/l1/l2 hackbench -g 48 -l 1000 -s 512 -T
It typically crashes within 30 seconds on this machine. A single-CPU
kernel or a very loose quota (e.g. 90%) is unlikely to trigger it
because the race window is narrow.
Thanks,
Zhang Qiao
> }
>
> + se = &curr->se;
> + reweight_eevdf(cfs_rq, se, weight, se->on_rq);
> +
> if (queued)
> return;
>
> @@ -13772,7 +13665,7 @@ prio_changed_fair(struct rq *rq, struct
> if (p->prio == oldprio)
> return;
>
> - if (rq->cfs.nr_queued == 1)
> + if (rq->cfs.h_nr_queued == 1)
> return;
>
> /*
> @@ -13901,29 +13794,40 @@ static void switched_to_fair(struct rq *
> }
> }
>