Re: [RFC][PATCH 08/10] sched/fair: Implement delayed dequeue

From: Peter Zijlstra
Date: Tue Jun 04 2024 - 15:12:43 EST


On Tue, Jun 04, 2024 at 03:23:41PM +0100, Luis Machado wrote:
> On 6/4/24 11:11, Peter Zijlstra wrote:

> > Note how dequeue_task() does uclamp_rq_dec() unconditionally, which is
> > then not balanced in the case below.
> >
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -3664,6 +3664,7 @@ static int ttwu_runnable(struct task_str
> > /* mustn't run a delayed task */
> > SCHED_WARN_ON(task_on_cpu(rq, p));
> > enqueue_task(rq, p, ENQUEUE_DELAYED);
> > + uclamp_rq_inc(rq, p);
> > }
> > if (!task_on_cpu(rq, p)) {
> > /*
>
> As Hongyan pointed out in a separate message, the above makes things
> worse, as we end up with even more leftover tasks in the uclamp
> buckets.
>
> I'm trying a fix in kernel/sched/core.c:enqueue_task that only
> calls uclamp_rq_inc if the task is not sched_delayed, so:
>
> - uclamp_rq_inc(rq, p);
> + if (!p->se.sched_delayed)
> + uclamp_rq_inc(rq, p);
>
> I'm not entirely sure it is correct, but it seems to fix things.
> I'm still running some tests.
>
> With the current code, given uclamp_rq_inc and uclamp_rq_dec get
> called in enqueue_task and dequeue_task, the additional enqueue_task
> call from ttwu_runnable for a delayed_dequeue task may do an additional
> unconditional call to uclamp_rq_inc, no?

Yes, I got enqueue_task() and class->enqueue_task() confused this
morning.

But with the above, you skip inc for sched_delayed, but dequeue_task()
will have done the dec, so isn't it then still unbalanced?
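
To make the question concrete, here is a toy userspace model (not kernel
code). It assumes dequeue_task() always does the dec, even when the class
ends up delaying the dequeue, and that enqueue_task() is the only place
doing the inc. All toy_* names are made up for illustration; one integer
stands in for the uclamp bucket count.

```c
#include <stdbool.h>

/*
 * Toy userspace model, NOT kernel code. All toy_* names are hypothetical.
 * One integer stands in for the per-rq uclamp bucket task count.
 */
struct toy_task {
	bool sched_delayed;	/* stands in for p->se.sched_delayed */
};

struct toy_rq {
	int bucket_tasks;	/* stands in for the uclamp bucket count */
};

/* Models enqueue_task(); skip_inc_if_delayed selects the proposed check. */
static void toy_enqueue_task(struct toy_rq *rq, struct toy_task *p,
			     bool skip_inc_if_delayed)
{
	if (!(skip_inc_if_delayed && p->sched_delayed))
		rq->bucket_tasks++;		/* uclamp_rq_inc() */
	p->sched_delayed = false;		/* task is fully queued again */
}

/* Models dequeue_task() with an unconditional dec (assumption). */
static void toy_dequeue_task(struct toy_rq *rq, struct toy_task *p,
			     bool delayed)
{
	rq->bucket_tasks--;			/* uclamp_rq_dec() */
	p->sched_delayed = delayed;		/* class may delay the dequeue */
}

/*
 * enqueue, delayed dequeue, then the re-enqueue from the wakeup path.
 * The task is runnable at the end, so a balanced count would be 1.
 */
static int toy_count_after_wakeup(bool skip_inc_if_delayed)
{
	struct toy_rq rq = { .bucket_tasks = 0 };
	struct toy_task p = { .sched_delayed = false };

	toy_enqueue_task(&rq, &p, skip_inc_if_delayed);
	toy_dequeue_task(&rq, &p, true);
	toy_enqueue_task(&rq, &p, skip_inc_if_delayed);

	return rq.bucket_tasks;
}
```

In this model toy_count_after_wakeup(true) returns 0 while the task is
runnable again, and toy_count_after_wakeup(false) returns 1. It only
covers this one sequence and does not model the later, final dequeue of a
delayed task, so it says nothing about the leftover tasks seen in
testing; it only shows why the conditional inc looks unbalanced if the
dec really is unconditional.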

Oh well, I'll go stare at this tomorrow.

In any case, is there a uclamp self-test somewhere?