Re: [PATCH v2 0/8] uclamp sum aggregation

From: Hongyan Xia
Date: Mon Mar 10 2025 - 08:55:14 EST


On 10/03/2025 11:34, Dietmar Eggemann wrote:
On 06/03/2025 12:38, Xuewen Yan wrote:
On Thu, Mar 6, 2025 at 7:32 PM Hongyan Xia <hongyan.xia2@xxxxxxx> wrote:

Hi Xuewen,

On 06/03/2025 11:12, Xuewen Yan wrote:
Hi Hongyan,

On Tue, Mar 4, 2025 at 10:26 PM Hongyan Xia <hongyan.xia2@xxxxxxx> wrote:

[...]

Subject: [PATCH] sched/uclamp: Update the rq's uclamp before enqueue task

When task's uclamp is set, we hope that the CPU frequency
can increase as quickly as possible when the task is enqueued.
Because the cpu frequency updating happens during the enqueue_task(),
so the rq's uclamp needs to be updated before the task is enqueued.
For sched-delayed tasks, the rq uclamp should only be updated
when they are enqueued upon being awakened.

Signed-off-by: Xuewen Yan <xuewen.yan@xxxxxxxxxx>
---
kernel/sched/core.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 67189907214d..b07e78910221 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1747,7 +1747,7 @@ static inline void uclamp_rq_dec_id(struct rq
*rq, struct task_struct *p,
}
}

-static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p)
+static inline void uclamp_rq_inc(struct rq *rq, struct task_struct
*p, int flags)
{
enum uclamp_id clamp_id;

@@ -1763,7 +1763,8 @@ static inline void uclamp_rq_inc(struct rq *rq,
struct task_struct *p)
if (unlikely(!p->sched_class->uclamp_enabled))
return;

- if (p->se.sched_delayed)
+ /* Only inc the delayed task which is being woken up. */
+ if (p->se.sched_delayed && !(flags & ENQUEUE_DELAYED))
return;

for_each_clamp_id(clamp_id)
@@ -2031,7 +2032,7 @@ static void __init init_uclamp(void)
}

#else /* !CONFIG_UCLAMP_TASK */
-static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p) { }
+static inline void uclamp_rq_inc(struct rq *rq, struct task_struct
*p, int flags) { }
static inline void uclamp_rq_dec(struct rq *rq, struct task_struct *p) { }
static inline void uclamp_fork(struct task_struct *p) { }
static inline void uclamp_post_fork(struct task_struct *p) { }
@@ -2067,12 +2068,9 @@ void enqueue_task(struct rq *rq, struct
task_struct *p, int flags)
if (!(flags & ENQUEUE_NOCLOCK))
update_rq_clock(rq);

+ uclamp_rq_inc(rq, p, flags);
+
p->sched_class->enqueue_task(rq, p, flags);
- /*
- * Must be after ->enqueue_task() because ENQUEUE_DELAYED can clear
- * ->sched_delayed.
- */
- uclamp_rq_inc(rq, p);

psi_enqueue(p, flags);

Like I mentioned already in the original thread:

https://lkml.kernel.org/r/65365ec7-6a16-4e66-8005-e78788cbedfa@xxxxxxx

I would prefer that uclamp stays in core.c. ENQUEUE_DELAYED among all
the other flags is already used there (ttwu_runnable()).

task_struct contains sched_{,rt_,dl_}entity}. We just have to be
careful when switching policies.

--

Could you also incorporate the changes in {en,de}queue_task_fair()
((task_on_rq_migrating(p) || (flags & {RESTORE,DEQUEUE}_SAVE))) vs.
(!p->se.sched_delayed || (flags & ENQUEUE_DELAYED)) and
(!p->se.sched_delayed) so the uclamp-util_est relation is easier to spot?

[...]

At the moment we can't do this. Sum aggregation was designed before delayed dequeue and it syncs with p->se.on_rq. If we sync with something else and take care of delayed dequeue cases (like util_est) then I have to rewrite part of the series.