Re: [PATCH] sched/fair: Fix DELAY_DEQUEUE issue related to cgroup throttling

From: Han Guangjiang

Date: Tue Sep 30 2025 - 21:15:41 EST


>> From: Han Guangjiang <hanguangjiang@xxxxxxxxxxx>
>>
>> When both CPU cgroup and memory cgroup are enabled with parent cgroup
>> resource limits much smaller than child cgroup's, the system frequently
>> hangs with NULL pointer dereference:
>>
> Is this the same issue as here:
>
> https://lore.kernel.org/all/105ae6f1-f629-4fe7-9644-4242c3bed035@xxxxxxx/T/#u
>
> ?

Yes, judging by the changes in that patch, I believe this is the same
issue. When dequeue_entities() runs on a delay-dequeued task while its
cgroup is being throttled, it returns early and skips the
__block_task() call for that task. This leaves p->on_rq and se->on_rq
inconsistent.

When PI or scheduler switching occurs, the second dequeue_entities()
call assumes the task is still in the CFS scheduler, but in reality
it is no longer there.
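
To make the sequence concrete, here is a small user-space toy model of
the state mismatch as I understand it. It is not kernel code: the
struct, the toy_* names and the simplified control flow are all made up
for illustration, and only the two on_rq flags and the early return are
modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a task; these are NOT the kernel structures. */
struct toy_task {
	int  task_on_rq;	/* models p->on_rq */
	bool se_on_rq;		/* models se->on_rq */
	bool sched_delayed;	/* models se->sched_delayed */
};

/*
 * Simplified model of the dequeue path: returns 1 when the dequeue ran
 * to completion and 0 when it stopped early at a throttled cgroup.
 */
static int toy_dequeue_entities(struct toy_task *p, bool cgroup_throttled)
{
	bool task_delayed = p->sched_delayed;

	/* The entity itself is taken off its queue first. */
	p->se_on_rq = false;
	p->sched_delayed = false;

	if (cgroup_throttled)
		return 0;	/* early exit: the fix-up below never runs */

	if (task_delayed)
		p->task_on_rq = 0;	/* models the __block_task() fix-up */

	return 1;
}

/* The two flags agree when both say "queued" or both say "not queued". */
static bool toy_state_consistent(const struct toy_task *p)
{
	return (p->task_on_rq != 0) == p->se_on_rq;
}

/* Walk through both cases for a delay-dequeued task. */
static void toy_demo(void)
{
	struct toy_task ok  = { .task_on_rq = 1, .se_on_rq = true, .sched_delayed = true };
	struct toy_task bad = { .task_on_rq = 1, .se_on_rq = true, .sched_delayed = true };

	toy_dequeue_entities(&ok, false);
	assert(toy_state_consistent(&ok));	/* fix-up ran, flags agree */

	toy_dequeue_entities(&bad, true);
	assert(!toy_state_consistent(&bad));	/* task_on_rq still 1, se_on_rq 0 */
}
```

In the throttled case the task-level flag still claims the task is
queued, which is the state a later dequeue would trip over.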

By the way, I have a question about the hrtick_update() in
dequeue_entities(). Should it be changed to:

dequeue_entities()
{
	...
	if (p)
		hrtick_update(rq);
	...
}

And remove hrtick_update() from dequeue_task_fair()?
Otherwise, for delay-dequeued tasks, hrtick_update() is executed
twice in this process.
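
As a rough illustration of the double call, the following user-space
sketch just counts invocations. It assumes my reading of the current
flow (a fix-up call inside dequeue_entities() for delayed tasks plus an
unconditional call in the caller); the toy_* functions are hypothetical
stand-ins and none of this is the real implementation.

```c
#include <assert.h>
#include <stdbool.h>

static int toy_hrtick_calls;	/* number of calls to the stand-in below */

/* Stand-in for hrtick_update(); it only counts how often it is called. */
static void toy_hrtick_update(void)
{
	toy_hrtick_calls++;
}

/* Assumed current shape: the inner function calls it for delayed tasks. */
static void toy_dequeue_entities_current(bool task_delayed)
{
	if (task_delayed)
		toy_hrtick_update();
}

/* Assumed current caller: one more unconditional call afterwards. */
static int toy_calls_current(bool task_delayed)
{
	toy_hrtick_calls = 0;
	toy_dequeue_entities_current(task_delayed);
	toy_hrtick_update();
	return toy_hrtick_calls;
}

/* Suggested shape: a single call inside, whenever a task is involved. */
static void toy_dequeue_entities_proposed(bool have_task)
{
	if (have_task)
		toy_hrtick_update();
}

/* Suggested caller: no call of its own. */
static int toy_calls_proposed(void)
{
	toy_hrtick_calls = 0;
	toy_dequeue_entities_proposed(true);
	return toy_hrtick_calls;
}

/* Compare the call counts of both shapes. */
static void toy_hrtick_demo(void)
{
	assert(toy_calls_current(true) == 2);	/* delayed task: two calls */
	assert(toy_calls_current(false) == 1);	/* normal task: one call */
	assert(toy_calls_proposed() == 1);	/* suggested shape: one call */
}
```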

Also, should the return type of dequeue_entities() be changed to
match dequeue_task_fair(), where true means the task was actually
removed from the queue, and false means it was delay dequeued?

Thanks,
Han Guangjiang