Re: [PATCH v3 2/2] sched: update the rq->avg_idle when a task is moved to an idle CPU

From: K Prateek Nayak

Date: Thu Nov 27 2025 - 05:12:39 EST

Hello Huang Shijie,

On 11/27/2025 2:44 PM, Huang Shijie wrote:
> void enqueue_task(struct rq *rq, struct task_struct *p, int flags)
> {
> + int delayed = p->se.sched_delayed;
> +
> if (!(flags & ENQUEUE_NOCLOCK))
> update_rq_clock(rq);
>
> @@ -2100,6 +2117,13 @@ void enqueue_task(struct rq *rq, struct task_struct *p, int flags)
>
> if (sched_core_enabled(rq))
> sched_core_enqueue(rq, p);
> +
> + if (delayed) {
> + if (entity_eligible(cfs_rq_of(&p->se), &p->se))
> + update_rq_avg_idle(rq);

Question: Why do we want to treat the delayed case like this?

If the entity is not eligible, do we want to act as if it
hasn't gone through a wakeup at all? Wouldn't that leave
rq->idle_stamp non-zero, so that the next wakeup sees it and
inaccurately accounts more idle time?
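
To make the concern concrete, here is a toy user-space model, not
kernel code. struct toy_rq, toy_update_avg_idle() and toy_scenario()
are hypothetical names, and the averaging step (move 1/8 of the way
toward the new sample, then clear the stamp) is only an assumption
about what update_rq_avg_idle() does. The max clamp is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for the few rq fields discussed here (hypothetical). */
struct toy_rq {
	uint64_t clock;      /* current time, ns */
	uint64_t idle_stamp; /* when the CPU went idle; 0 means "not idle" */
	uint64_t avg_idle;   /* running average of idle periods, ns */
};

/*
 * Assumed shape of update_rq_avg_idle(): fold the elapsed idle period
 * into the running average, then clear the stamp.
 */
static void toy_update_avg_idle(struct toy_rq *rq)
{
	int64_t diff;

	if (!rq->idle_stamp)
		return;

	assert(rq->clock >= rq->idle_stamp);
	diff = (int64_t)(rq->clock - rq->idle_stamp) - (int64_t)rq->avg_idle;
	rq->avg_idle = (uint64_t)((int64_t)rq->avg_idle + diff / 8);
	rq->idle_stamp = 0;
}

/*
 * CPU goes idle at t=1000, a task is enqueued at t=2000 and keeps the
 * CPU busy, and another wakeup arrives at t=9000.
 */
static uint64_t toy_scenario(int update_on_first_enqueue)
{
	struct toy_rq rq = { .clock = 1000, .idle_stamp = 1000, .avg_idle = 1000 };

	rq.clock = 2000;		/* first enqueue after 1000 ns idle */
	if (update_on_first_enqueue)
		toy_update_avg_idle(&rq);

	rq.clock = 9000;		/* later wakeup, CPU was busy meanwhile */
	toy_update_avg_idle(&rq);

	return rq.avg_idle;
}
```

In this model, skipping the update at the first enqueue leaves
idle_stamp set, so the second wakeup charges the whole 8000 ns
(including 7000 ns of busy time) as idle and the average moves from
1000 to 1875. With the update done at the first enqueue it stays at
1000.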

Also, if we've done newidle balance and rq->idle_stamp is
set, we cannot have delayed tasks, since pick_next_task() would
have dequeued all delayed tasks before reaching newidle
balance.

Just doing an update_rq_avg_idle() unconditionally should be
fine.

> + } else {
> + update_rq_avg_idle(rq);
> + }
> }
>
> /*
--
Thanks and Regards,
Prateek