Re: [PATCH v2 1/7] sched/fair: Fix zero_vruntime tracking

From: K Prateek Nayak

Date: Tue Mar 31 2026 - 00:59:19 EST


On 3/31/2026 6:08 AM, K Prateek Nayak wrote:
>> I'm thinking that if you have two groups, and the tick always hits the
>> one group, the other group can go a while without ever getting updated.
>
> Ack! That could be, but I only have one cgroup on top of the root cgroup
> as far as cpu controllers are concerned, so the sched_yield() catching up
> the avg_vruntime() should have worked. Either way, I have more data:
>
> When I hit the overflow warning, I have:
>
> se: entity_key(-83106064385) weight(90891264) overflow(-7553615238018032640)
> cfs_rq: zero_vruntime(138430453113448575) sum_w_vruntime(0) sum_weight(0)
> cfs_rq->curr: entity_key(0) vruntime(138430453113448575) deadline(138430500540426854)
> Post avg_vruntime():
> se: entity_key(-83106064385) weight(90891264) overflow(-7553615238018032640)
> cfs_rq: zero_vruntime(138430453113448575) sum_w_vruntime(0) sum_weight(0)
> cfs_rq->curr: entity_key(0) vruntime(138430453113448575) deadline(138430500540426854)
>
> so running avg_vruntime() doesn't make a difference and it seems to be a
> genuine case of place_entity() putting the newly woken entity pretty
> far back in the timeline. (I forgot to print weights!)
>
> Now, the funny part is, if I leave the system undisturbed, I get a few
> of the above warnings and nothing interesting, but as soon as I do a:
>
> grep bits /sys/kernel/debug/sched/debug
>
> Boom! Pick fails very consistently (Because of copy-pasta this too
> doesn't contain weights):
>
> NULL Pick!
> cfs_rq: zero_vruntime(89029406877992895) sum_w_vruntime(-135049248768) sum_weight(1048576)
> cfs_rq->curr: entity_key(149162) vruntime(89029406878142057) deadline(89029406976268435)
> queued se: entity_key(-123294) vruntime(89029406877869601) deadline(89029406880669601)
>
> after avg_vruntime()!
> cfs_rq: zero_vruntime(89029406877868114) sum_w_vruntime(-4206886912) sum_weight(1048576)
> cfs_rq->curr: entity_key(273943) vruntime(89029406878142057) deadline(89029406976268435)
> queued se: entity_key(1487) vruntime(89029406877869601) deadline(89029406880669601)
>
> NULL Pick!
>
> The above doesn't recover after an avg_vruntime(). Btw I'm running:
>
> nice -n 19 stress-ng --yield 32 -t 1000000s&
> while true; do perf bench sched messaging -p -t -l 100000 -g 16; done
>
> Nice 19 is to get a large deadline and keep catching up to that deadline
> at every yield to see if that makes any difference.
>
>>
>> But if there's no cgroups, this can't be it.
>>
>> Anyway, something like the below would rule this out I suppose.
>
> I'll add that in and see if it makes a difference. I'll add in
> weights and look at place_entity() to see if we have anything
> interesting going on there.

Still trips the issue :-( This time I have logs with weights.

For the warning:

se: entity_key(-72358759771) weight(90891264) warning_mul(-6576779137058540544) vlag(39009) delayed?(0)
cfs_rq: zero_vruntime(18695504496613622) sum_w_vruntime(0) sum_weight(0)
cfs_rq->curr: entity_key(0) vruntime(18695504496613622) deadline(18695540588878716) weight(49)
Post avg_vruntime():
se: entity_key(-72358759771) weight(90891264) overflow?(-6576779137058540544)
cfs_rq: zero_vruntime(18695504496613622) sum_w_vruntime(0) sum_weight(0)
cfs_rq->curr: entity_key(0) vruntime(18695504496613622) deadline(18695540588878716) weight(49)
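
As a quick sanity check on the two warning dumps, the logged
overflow()/warning_mul() value can be recomputed as entity_key * weight.
The sketch below only redoes that arithmetic and compares it against the
signed 64-bit range. The helper name weighted_key() is made up for
illustration, and the threshold the warning actually uses is not shown in
the logs, so nothing is assumed about it here.

```python
# Sketch: recompute key * weight for the two warning dumps and compare
# against the signed 64-bit range. Only the raw arithmetic is checked;
# the condition that fires the warning is not part of the logs.

S64_MIN = -(2**63)
S64_MAX = 2**63 - 1


def weighted_key(entity_key: int, weight: int) -> tuple[int, bool]:
    """Return (entity_key * weight, whether it fits in a signed 64-bit int)."""
    product = entity_key * weight  # Python ints do not wrap
    return product, S64_MIN <= product <= S64_MAX


# (entity_key, weight, value printed in the log)
WARNING_DUMPS = [
    (-83106064385, 90891264, -7553615238018032640),
    (-72358759771, 90891264, -6576779137058540544),
]

for key, weight, logged in WARNING_DUMPS:
    product, fits = weighted_key(key, weight)
    print(f"key={key} weight={weight} product={product} "
          f"matches_log={product == logged} fits_s64={fits} "
          f"magnitude_bits={abs(product).bit_length()}")
```

Both products match the logged values and still fit in an s64, but each
already needs all 63 magnitude bits, so there is less than a factor of two
of headroom left.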


And the NULL pick while reading debugfs (probably something in the initial
task wakeup path that trips it?):

NULL Pick!
cfs_rq: zero_vruntime(21126236598445952) sum_w_vruntime(-1074569456640) sum_weight(15360)
cfs_rq->curr: entity_key(69958950) vruntime(21126236668404902) deadline(21126236859551568) weight(15360)
queued se: entity_key(32498584) vruntime(21126236630944536) deadline(21126236822091202) weight(15360)

After avg_vruntime():
cfs_rq: zero_vruntime(21126236598445952) sum_w_vruntime(-1074569456640) sum_weight(15360)
cfs_rq->curr: entity_key(69958950) vruntime(21126236668404902) deadline(21126236859551568) weight(15360)
queued se: entity_key(32498584) vruntime(21126236630944536) deadline(21126236822091202) weight(15360)
NULL Pick!

The updated zero_vruntime is behind the vruntime of both curr and the
queued entity. Now that I have a reliable trigger for the crash, I'll just
start tracing everything before I run grep (although I suspect something
may have gone bad a long time ago, but we can be hopeful).
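
For reference, the numbers from the second NULL pick dump above can be run
through an eligibility check by hand. The sketch below assumes the check
has the same shape as mainline vruntime_eligible(): fold curr's
key * weight into the sum, add curr's weight to the load, then test
avg >= key * load. It also assumes the logged weight() is in the same
units as sum_weight. The helper names are made up for illustration.

```python
# Sketch of an eligibility check over the logged values, assuming the
# mainline vruntime_eligible() shape. Not taken from the patch under test.


def fold_in_curr(sum_w_vruntime, sum_weight, curr_key, curr_weight):
    """Add curr's contribution, since curr is not part of the queued sums."""
    avg = sum_w_vruntime + curr_key * curr_weight
    load = sum_weight + curr_weight
    return avg, load


def eligible(avg, load, key):
    """An entity is eligible if its key is at or before the weighted average."""
    return avg >= key * load


# Values from the "NULL Pick!" dump that includes weights.
SUM_W_VRUNTIME = -1074569456640
SUM_WEIGHT = 15360
CURR_KEY, CURR_WEIGHT = 69958950, 15360
QUEUED_KEY, QUEUED_WEIGHT = 32498584, 15360

avg, load = fold_in_curr(SUM_W_VRUNTIME, SUM_WEIGHT, CURR_KEY, CURR_WEIGHT)
print("avg:", avg, "load:", load, "average key:", avg // load)
print("curr eligible:  ", eligible(avg, load, CURR_KEY))
print("queued eligible:", eligible(avg, load, QUEUED_KEY))

# If the queued entity were the only one in the sums, sum_w_vruntime would
# be expected to equal its key * weight. Print both for comparison.
print("queued key * weight:", QUEUED_KEY * QUEUED_WEIGHT,
      "logged sum_w_vruntime:", SUM_W_VRUNTIME)
```

Under these assumptions the folded-in sum comes out to 15360 over a load of
30720, i.e. an average key that rounds down to 0, which would be consistent
with zero_vruntime not moving after avg_vruntime(). Both entity keys are
large and positive, so neither entity passes the check.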

--
Thanks and Regards,
Prateek