Re: [tip:sched/core] [sched/fair] 79104becf4: BUG:kernel_NULL_pointer_dereference,address
From: Fernand Sieber
Date: Tue Nov 04 2025 - 16:05:43 EST
Hi Peter,

I spent some time today investigating this report. The crash happens when
a proxy task yields.

Since it probably doesn't make sense for a task that is blocking the best
pick to yield, a simple workaround is to ignore the yield in this case:

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8993,6 +8993,11 @@ static void yield_task_fair(struct rq *rq)
 	if (unlikely(rq->nr_running == 1))
 		return;
+	/* Don't yield if we're running a proxy task */
+	if (rq->donor && rq->donor != curr) {
+		return;
+	}
+
However, more generally, I am not sure that the logic in update_min_vruntime()
is sound when we are running a proxy task, which I suspect is the ultimate
root cause of the problem. It seems to assume that cfs_rq->curr is the
running task, which is not the case.

In my troubleshooting I have seen inconsistent calculations, with underflows
of cfs_rq->avg_vruntime and avg_vruntime(cfs_rq) being lower than
min_vruntime. I'll see if I can invest more time diving into this. In the
meantime, do you have any thoughts?
Thanks,
--Fernand