Re: [tip:sched/core] [sched/fair] 79104becf4: BUG:kernel_NULL_pointer_dereference,address

From: Fernand Sieber

Date: Tue Nov 04 2025 - 16:05:43 EST


Hi Peter,

I spent some time today investigating this report. The crash happens when
a proxy task yields.

Since it probably doesn't make sense for a task that is blocking the best
pick to yield, a simple workaround is to ignore the yield in this case:

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8993,6 +8993,11 @@ static void yield_task_fair(struct rq *rq)
 	if (unlikely(rq->nr_running == 1))
 		return;
 
+	/* Don't yield if we're running a proxy task */
+	if (rq->donor && rq->donor != curr) {
+		return;
+	}
+
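
For illustration only, the condition in the hunk above can be exercised in
isolation. This is a user-space sketch, not kernel code: the structs are
stripped-down stand-ins that hold only the two pointers the check reads,
running_proxy() is a hypothetical helper name, and it assumes the local
`curr` in yield_task_fair() refers to rq->curr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in types with only the fields the check reads (not the kernel structs). */
struct task_struct {
	int pid;
};

struct rq {
	struct task_struct *curr;	/* assumed: task actually executing */
	struct task_struct *donor;	/* assumed: task lending its scheduling context */
};

/* Hypothetical helper mirroring the condition added in the hunk above. */
static bool running_proxy(const struct rq *rq)
{
	assert(rq != NULL);
	return rq->donor && rq->donor != rq->curr;
}
```

With donor == curr (or donor == NULL) the helper returns false and the yield
would proceed as before; it returns true only when the two differ.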

However, more generally, I am not sure that the logic in update_min_vruntime()
is sound when we are running a proxy task, which I suspect is the ultimate
root cause of the problem. It seems to assume that cfs_rq->curr is the
running task, which is not the case.

In my troubleshooting I have seen inconsistent calculations with underflows
of cfs_rq->avg_vruntime and avg_vruntime(cfs_rq) being lower than
min_vruntime. I'll see if I can invest more time diving into this. In the
meantime, do you have any thoughts?
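
To make the symptom described above concrete, here is a toy user-space model
of a weighted average vruntime kept relative to min_vruntime. It is a
simplification, not the code in fair.c: toy_cfs_rq, toy_add(), toy_remove(),
toy_avg_vruntime() and toy_consistent() are hypothetical names, and the model
assumes every queued entity has vruntime >= min_vruntime, so a negative sum
can only come from a bookkeeping mismatch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: the average is stored as a weighted sum of relative keys. */
struct toy_cfs_rq {
	uint64_t min_vruntime;	/* reference point for the relative keys */
	int64_t avg_vruntime;	/* sum of weight * (vruntime - min_vruntime) */
	int64_t avg_load;	/* sum of weights */
};

/* Account an entity into the running sums. */
static void toy_add(struct toy_cfs_rq *cfs_rq, uint64_t vruntime, int64_t weight)
{
	int64_t key = (int64_t)(vruntime - cfs_rq->min_vruntime);

	cfs_rq->avg_vruntime += key * weight;
	cfs_rq->avg_load += weight;
}

/* Remove an entity; removing one that was never added corrupts the sums. */
static void toy_remove(struct toy_cfs_rq *cfs_rq, uint64_t vruntime, int64_t weight)
{
	int64_t key = (int64_t)(vruntime - cfs_rq->min_vruntime);

	cfs_rq->avg_vruntime -= key * weight;
	cfs_rq->avg_load -= weight;
}

/* Absolute average: min_vruntime plus the weighted mean of the keys. */
static uint64_t toy_avg_vruntime(const struct toy_cfs_rq *cfs_rq)
{
	assert(cfs_rq->avg_load > 0);
	return cfs_rq->min_vruntime +
	       (uint64_t)(cfs_rq->avg_vruntime / cfs_rq->avg_load);
}

/* Under the toy's assumption, a negative sum means the accounting is off. */
static bool toy_consistent(const struct toy_cfs_rq *cfs_rq)
{
	return cfs_rq->avg_vruntime >= 0;
}
```

In this model, adding two entities and then removing a third that was never
accounted drives the sum negative, and toy_avg_vruntime() then lands below
min_vruntime, which is the shape of the inconsistency mentioned above.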

Thanks,
--Fernand


