Re: [PATCH] sched/uclamp: Let each sched_class handle uclamp

From: Xuewen Yan
Date: Sun Mar 09 2025 - 22:42:02 EST


On Sat, Mar 8, 2025 at 2:32 AM Dietmar Eggemann
<dietmar.eggemann@xxxxxxx> wrote:
>
> On 06/03/2025 13:01, Xuewen Yan wrote:
> > On Thu, Mar 6, 2025 at 2:24 AM Dietmar Eggemann
> > <dietmar.eggemann@xxxxxxx> wrote:
> >>
> >> On 27/02/2025 14:54, Hongyan Xia wrote:
> >>
> >> [...]
> >>
> >>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> >>> index 857808da23d8..7e5a653811ad 100644
> >>> --- a/kernel/sched/fair.c
> >>> +++ b/kernel/sched/fair.c
> >>> @@ -6941,8 +6941,10 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
> >>> * Let's add the task's estimated utilization to the cfs_rq's
> >>> * estimated utilization, before we update schedutil.
> >>> */
> >>> - if (!(p->se.sched_delayed && (task_on_rq_migrating(p) || (flags & ENQUEUE_RESTORE))))
> >>> + if (!(p->se.sched_delayed && (task_on_rq_migrating(p) || (flags & ENQUEUE_RESTORE)))) {
> >>> + uclamp_rq_inc(rq, p);
> >>> util_est_enqueue(&rq->cfs, p);
> >>> + }
> >>
> >> So you want to have p uclamp-enqueued so that its uclamp_min value
> >> counts for the cpufreq_update_util()/cfs_rq_util_change() calls later in
> >> enqueue_task_fair?
> >>
> >> if (p->in_iowait)
> >> cpufreq_update_util(rq, SCHED_CPUFREQ_IOWAIT);
> >>
> >> enqueue_entity() -> update_load_avg() -> cfs_rq_util_change() ->
> >> cpufreq_update_util()
> >>
> >> But if you do this before requeue_delayed_entity() (1) you will not
> >> uclamp-enqueue p which got his ->sched_delayed just cleared in (1)?
> >>
> >
> > Could we change to the following:
> >
> > when enqueue:
> >
> > - if (!(p->se.sched_delayed && (task_on_rq_migrating(p) || (flags
> > & ENQUEUE_RESTORE))))
> > + if (!(p->se.sched_delayed && !(flags & ENQUEUE_DELAYED)))
>
> Why you want to check ENQUEUE_DELAYED as well here? Isn't
> !p->se.sched_delayed implying !ENQUEUE_DELAYED).

Indeed, the (!(p->se.sched_delayed && !(flags & ENQUEUE_DELAYED))) is equal to
the (!(p->se.sched_delayed && (task_on_rq_migrating(p) || (flags &
ENQUEUE_RESTORE)))).
I just think it might be easier to read using the ENQUEUE_DELAYED flag.
Because we only allow enq the uclamp and util_est when wake up the delayed-task.


>
> > + uclamp_rq_inc(rq, p);
> > util_est_enqueue(&rq->cfs, p);
> > + }
> >
> >
> > when dequeue:
> >
> > - if (!(p->se.sched_delayed && (task_on_rq_migrating(p) ||
> > (flags & DEQUEUE_SAVE))))
> > + if (!p->se.sched_delayed) {
> > + uclamp_rq_dec(rq, p);
> > util_est_dequeue(&rq->cfs, p);
> > + }
>
> So you want to exclude all delayed tasks from util_est and uclamp?

Yes, Because we had dequeued the task's uclamp and util_est when first sleeping。
This is achieved by putting the update of uclamp and util_est before
dequeue_entity.
For other dequeue, we should always ignore it. So there is no need to
judge the flags and migration.

BR