Re: [RFC PATCH] sched/ext: add sched_info_enqueue in dispatch path for schedstat
From: Tejun Heo
Date: Wed May 27 2026 - 15:34:18 EST
On Mon, May 25, 2026 at 09:19:42PM +0200, Fernand Sieber wrote:
> SCX tasks re-dispatched from BPF (e.g., after bandwidth throttling)
> bypass enqueue_task() and go directly through dispatch_enqueue().
> This skips sched_info_enqueue(), leaving last_queued at 0, which
> prevents run_delay from accumulating in /proc/<pid>/schedstat.
>
> Add sched_info_enqueue() in dispatch_enqueue() when last_queued is
> not already set. This ensures run_delay correctly reflects the time
> a task spends waiting for a CPU after being dispatched, including
> time spent in BPF-managed throttle queues.
>
> Without this fix, schedstat shows frozen run_delay values for SCX
> tasks that go through throttle/unthrottle cycles.
>
> Signed-off-by: Fernand Sieber <sieberf@xxxxxxxxxx>
> ---
> kernel/sched/ext.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 7ebdaf75d..827a96e39 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -1525,6 +1525,10 @@ static void dispatch_enqueue(struct scx_sched *sch, struct rq *rq,
>  	WARN_ON_ONCE((p->scx.dsq_flags & SCX_TASK_DSQ_ON_PRIQ) ||
>  		     !RB_EMPTY_NODE(&p->scx.dsq_priq));
>
> +	/* Track queue time for schedstat run_delay accounting */
> +	if (!p->sched_info.last_queued)
> +		sched_info_enqueue(task_rq(p), p);
I don't think this works. A DSQ can be used for throttling too and a BPF
data structure can be used for non-throttling queueing too. I don't see how
doing the above unconditionally would capture something meaningful reliably.
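
One possible reading of this objection, sketched as a small userspace model
rather than kernel code. Everything named model_* is hypothetical and only
imitates the "stamp last_queued if it is zero, fold it into run_delay when the
task gets a CPU" behaviour the patch relies on. The two helpers assume the same
total wait, once with the task held in a BPF data structure before dispatch and
once with the task parked in a DSQ for the whole time. Timestamps are assumed
to be nonzero, since zero doubles as "no stamp pending".

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the schedstat fields; not the kernel's struct. */
struct model_sched_info {
	uint64_t last_queued;	/* 0 means "no queue stamp pending" */
	uint64_t run_delay;	/* accumulated wait time */
};

/* Same guard as the patch: stamp only when no stamp is pending. */
void model_enqueue(struct model_sched_info *si, uint64_t now)
{
	if (!si->last_queued)
		si->last_queued = now;
}

/* Task gets a CPU: fold the pending wait into run_delay. */
void model_arrive(struct model_sched_info *si, uint64_t now)
{
	if (!si->last_queued)
		return;
	si->run_delay += now - si->last_queued;
	si->last_queued = 0;
}

/*
 * Case A (assumed): the BPF scheduler holds the task in its own data
 * structure from held_at to release_at, and only then dispatches it.
 * The stamp is taken at dispatch, so the held time is not counted.
 */
uint64_t model_hold_in_bpf_queue(uint64_t held_at, uint64_t release_at,
				 uint64_t run_at)
{
	struct model_sched_info si = { 0, 0 };

	assert(held_at <= release_at && release_at <= run_at);
	model_enqueue(&si, release_at);	/* stamp at dispatch time */
	model_arrive(&si, run_at);
	return si.run_delay;
}

/*
 * Case B (assumed): the BPF scheduler throttles by parking the task in a
 * DSQ at held_at. The stamp is taken immediately, so the whole wait counts.
 */
uint64_t model_hold_in_dsq(uint64_t held_at, uint64_t run_at)
{
	struct model_sched_info si = { 0, 0 };

	assert(held_at <= run_at);
	model_enqueue(&si, held_at);	/* stamp when entering the DSQ */
	model_arrive(&si, run_at);
	return si.run_delay;
}
```

With a task held from t=100 and running at t=520, the model reports 20 when the
task sat in a BPF structure until t=500 and 420 when it sat in a DSQ throughout.
Under these assumptions the same wait yields different run_delay values
depending only on where the BPF scheduler chose to keep the task.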
Thanks.
--
tejun