Hi Prateek,
On Wed, Mar 26, 2025 at 12:54 AM K Prateek Nayak <kprateek.nayak@xxxxxxx> wrote:
Hello Xuewen,
On 3/25/2025 7:17 AM, Xuewen Yan wrote:
When task's uclamp is set, we hope that the CPU frequency
can increase as quickly as possible when the task is enqueued.
Because the cpu frequency updating happens during the enqueue_task(),
so the rq's uclamp needs to be updated before the task is enqueued,
just like util_est.
So, aline the uclamp and util_est and call before freq update.
For sched-delayed tasks, the rq uclamp/util_est should only be updated
when they are enqueued upon being awakened.
So simply the logic of util_est's enqueue/dequeue check.
Signed-off-by: Xuewen Yan <xuewen.yan@xxxxxxxxxx>
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c798d2795243..c92fee07fb7b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6930,7 +6930,7 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
* Let's add the task's estimated utilization to the cfs_rq's
* estimated utilization, before we update schedutil.
*/
- if (!(p->se.sched_delayed && (task_on_rq_migrating(p) || (flags & ENQUEUE_RESTORE))))
+ if (!p->se.sched_delayed || (flags & ENQUEUE_DELAYED))
util_est_enqueue(&rq->cfs, p);
Wouldn't this do a util_est_{dequeue,enqueue}() for a save restore
operation too of a non-delayed task? Is that desired?
For delayed-task, its util_est should dequeue/enqueue only for its
sleeping and waking up,
For the save restore operation, there is no need to enqueue it,
because it is not woken up.
So the condition of enqueue actually is:
if (!p->se.sched_delayed || (p->se.sched_delayed && (flags & ENQUEUE_DELAYED)))
And, this is equal to :
if (!p->se.sched_delayed || (flags & ENQUEUE_DELAYED))
More details here:
https://lore.kernel.org/all/84441660bef0a5e67fd09dc3787178d0276dad31.1740664400.git.hongyan.xia2@xxxxxxx/T/#ma2505e90489316eb354390b42dee9d053f6fd1e9
On a larger note ...
An enqueue of a delayed task will call requeue_delayed_entity() which
will only enqueue p->se on its cfs_rq and do an update_load_avg() for
that cfs_rq alone.
With cgroups enabled, this cfs_rq might not be the root cfs_rq and
cfs_rq_util_change() will not call cpufreq_update_util() leaving the
CPU running at the older frequency despite the updated uclamp
constraints.
If think cfs_rq_util_change() should be called for the root cfs_rq
when a task is delayed or when it is re-enqueued to re-evaluate
the uclamp constraints.
I think you're referring to a different issue with the delayed-task's
util_ets/uclamp.
This issue is unrelated to util-est and uclamp, because even without
these two features, the problem you're mentioning still exists.
Specifically, if the delayed-task is not the root CFS task, the CPU
frequency might not be updated in time when the delayed-task is
enqueued.
Maybe we could add the update_load_avg() in clear_delayed to solve the issue?
-->8--
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a0c4cd26ee07..c75d50dab86b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5435,6 +5435,7 @@ static void clear_delayed(struct sched_entity *se)
for_each_sched_entity(se) {
struct cfs_rq *cfs_rq = cfs_rq_of(se);
+ update_load_avg(cfs_rq, se, UPDATE_TG);
cfs_rq->h_nr_runnable++;
if (cfs_rq_throttled(cfs_rq))
break;
---
BR
xuewen