Re: [PATCH v2 4/5] sched: Add task enqueue/dequeue trace points

From: Nam Cao
Date: Thu Aug 21 2025 - 03:07:03 EST


On Fri, Aug 15, 2025 at 03:52:12PM +0200, Peter Zijlstra wrote:
> On Fri, Aug 15, 2025 at 03:40:17PM +0200, Peter Zijlstra wrote:
> > On Wed, Aug 06, 2025 at 10:01:20AM +0200, Nam Cao wrote:
> >
> > > +/*
> > > + * The two trace points below may not work as expected for fair tasks due
> > > + * to delayed dequeue. See:
> > > + * https://lore.kernel.org/lkml/179674c6-f82a-4718-ace2-67b5e672fdee@xxxxxxx/
> > > + */
> >
> > > +DECLARE_TRACE(dequeue_task,
> > > + TP_PROTO(int cpu, struct task_struct *task),
> > > + TP_ARGS(cpu, task));
> > > +
> >
> > > @@ -2119,7 +2121,11 @@ inline bool dequeue_task(struct rq *rq, struct task_struct *p, int flags)
> > > * and mark the task ->sched_delayed.
> > > */
> > > uclamp_rq_dec(rq, p);
> > > - return p->sched_class->dequeue_task(rq, p, flags);
> > > + if (p->sched_class->dequeue_task(rq, p, flags)) {
> > > + trace_dequeue_task_tp(rq->cpu, p);
> > > + return true;
> > > + }
> > > + return false;
> > > }
> >
> > Hurmpff.. that's not very nice.
> >
> > How about something like:
> >
> > dequeue_task():
> > 	...
> > 	ret = p->sched_class->dequeue_task(rq, p, flags);
> > 	if (trace_dequeue_task_tp_enabled() && !(flags & DEQUEUE_SLEEP))
> > 		__trace_dequeue_task_tp(rq->cpu, p);
> > 	return ret;
> >
> >
> > __block_task():
> > 	trace_dequeue_task_tp(rq->cpu, p);
> > 	...
> >
> >
> > Specifically, only DEQUEUE_SLEEP is allowed to fail, and DEQUEUE_SLEEP
> > will eventually cause __block_task() to be called, either directly, or
> > delayed.
>
> If you extend the tracepoint with the sleep state, you can probably
> remove the nr_running tracepoints. Esp. once we get this new throttle
> stuff sorted.

Sorry, I'm a bit out of my depth here. Can you elaborate?

By "sleep state" do you mean (flags & DEQUEUE_SLEEP)? The nr_running
tracepoints are not hit if the task is throttled, while these new
tracepoints are hit. How does the sleep state help with this difference?
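
For reference, the placement suggested above can be modelled in userspace
roughly as follows. This is only a simplified sketch, not kernel code: the
struct, the "delay" parameter, the flag value and the hit counter are all
made-up stand-ins, and the model assumes that every successful
DEQUEUE_SLEEP dequeue ends in the __block_task() equivalent.

```c
#include <assert.h>
#include <stdbool.h>

#define DEQUEUE_SLEEP 0x01	/* stand-in value, not the kernel's definition */

/* Hypothetical, minimal stand-in for struct task_struct. */
struct task {
	bool on_rq;
	bool sched_delayed;
};

static int trace_hits;		/* counts how often the model tracepoint fires */

static void trace_dequeue_task(struct task *p)
{
	(void)p;
	trace_hits++;
}

/* Models __block_task(): the point where a sleeping task really leaves. */
static void block_task(struct task *p)
{
	assert(p->on_rq);
	trace_dequeue_task(p);
	p->on_rq = false;
	p->sched_delayed = false;
}

/*
 * Models p->sched_class->dequeue_task(). "delay" is an assumption of this
 * sketch: it says whether the class chooses to delay a sleep dequeue.
 * Only a DEQUEUE_SLEEP dequeue may fail (return false).
 */
static bool class_dequeue(struct task *p, int flags, bool delay)
{
	if ((flags & DEQUEUE_SLEEP) && delay) {
		p->sched_delayed = true;	/* task stays queued for now */
		return false;
	}

	if (flags & DEQUEUE_SLEEP) {
		block_task(p);			/* tracepoint fires in here */
	} else {
		p->on_rq = false;
		p->sched_delayed = false;
	}
	return true;
}

/* Models dequeue_task(): trace here only for non-sleep dequeues. */
static bool dequeue_task(struct task *p, int flags, bool delay)
{
	bool ret = class_dequeue(p, flags, delay);

	if (!(flags & DEQUEUE_SLEEP))
		trace_dequeue_task(p);
	return ret;
}
```

In this model a delayed sleep dequeue does not fire the tracepoint until
the later call that actually blocks the task, and a non-sleep dequeue
fires it exactly once.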

Also +Cc Phil Auld <pauld@xxxxxxxxxx>, who seems to care about the
nr_running tracepoints.

Nam