Re: [v6.12] WARNING: at kernel/sched/deadline.c:1995 enqueue_dl_entity (task blocked for more than 28262 seconds)

From: Peter Zijlstra
Date: Mon Dec 09 2024 - 05:55:28 EST


On Fri, Dec 06, 2024 at 11:57:30AM -0500, Vineeth Remanan Pillai wrote:

> I was able to reproduce this WARN_ON a couple of days back with
> syzkaller. dlserver's dl_se gets enqueued during an update_curr while
> the dlserver is stopped. And a subsequent dlserver start will cause a
> double enqueue.

Right, I spotted that hole late last week. There is this thread:

https://lore.kernel.org/all/20241209094941.GF21636@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/T/#u

Where I just added this hunk:

@@ -1674,6 +1679,12 @@ void dl_server_start(struct sched_dl_entity *dl_se)

 void dl_server_stop(struct sched_dl_entity *dl_se)
 {
+	if (current->dl_server == dl_se) {
+		struct rq *rq = rq_of_dl_se(dl_se);
+		trace_printk("stop fair server %d\n", cpu_of(rq));
+		current->dl_server = NULL;
+	}
+
 	if (!dl_se->dl_runtime)
 		return;

Which was my attempt at plugging said hole. But since I do not have a
means of reproducing it, I'm not at all sure it is sufficient :/
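
For readers following along, the sequence described above can be mimicked
in a small userspace toy model. This is only a sketch under stated
assumptions, not kernel code: every toy_* name is hypothetical, the
"enqueue" is just a counter, and the idea that the stale enqueue reaches
the server through current->dl_server is inferred from the hunk rather
than confirmed by a reproducer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; NOT the real definitions. */
struct toy_dl_se {
	bool active;	/* server has been started */
	int on_rq;	/* enqueue count; a value above 1 models the double enqueue */
};

struct toy_task {
	struct toy_dl_se *dl_server;	/* server this task was picked through */
};

static void toy_enqueue(struct toy_dl_se *se)
{
	assert(se != NULL);
	se->on_rq++;
}

static void toy_server_start(struct toy_dl_se *se)
{
	if (se->active)
		return;
	se->active = true;
	toy_enqueue(se);
}

/* apply_hunk selects whether the stale pointer is cleared, as in the hunk. */
static void toy_server_stop(struct toy_task *curr, struct toy_dl_se *se,
			    bool apply_hunk)
{
	if (apply_hunk && curr->dl_server == se)
		curr->dl_server = NULL;
	se->active = false;
	se->on_rq = 0;	/* dequeue */
}

/* Assumed model of update_curr: accounting via a stale pointer re-enqueues. */
static void toy_update_curr(struct toy_task *curr)
{
	if (curr->dl_server)
		toy_enqueue(curr->dl_server);
}

/* Runs start, stop, update_curr, start and returns the final enqueue count. */
static int toy_scenario(bool apply_hunk)
{
	struct toy_dl_se se = { .active = false, .on_rq = 0 };
	struct toy_task curr = { .dl_server = NULL };

	toy_server_start(&se);
	curr.dl_server = &se;	/* current was picked through the server */
	toy_server_stop(&curr, &se, apply_hunk);
	toy_update_curr(&curr);	/* enqueues while stopped if pointer is stale */
	toy_server_start(&se);	/* second enqueue in the unpatched case */
	return se.on_rq;
}
```

In this model toy_scenario(false) ends with an enqueue count of 2 (the
double enqueue), while toy_scenario(true) ends with 1. It says nothing
about whether other paths in the real scheduler can still reach the
stopped server.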