Re: [PATCH] RFC: sched: Rework task_sched_runtime to avoid calling update_rq_clock

From: Qais Yousef
Date: Thu Jun 13 2024 - 07:58:05 EST


On 06/13/24 12:04, Peter Zijlstra wrote:

> ---
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 0935f9d4bb7b..d4b87539d72a 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -724,7 +724,6 @@ static void update_rq_clock_task(struct rq *rq, s64 delta)
>
> rq->prev_irq_time += irq_delta;
> delta -= irq_delta;
> - psi_account_irqtime(rq->curr, irq_delta);
> delayacct_irq(rq->curr, irq_delta);
> #endif
> #ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
> @@ -5459,6 +5458,8 @@ void sched_tick(void)
>
> sched_clock_tick();
>
> + psi_account_irqtime(curr, &rq->psi_irq_time);
> +

If wakeup preemption causes a context switch, wouldn't we lose this
information then? I *think* active migration might cause this information to be
lost too.

pick_next_task() might be a better place to do the accounting?
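
To make the concern concrete, here is a toy userspace model. It is not kernel
code, and every name in it (toy_rq, toy_account, toy_run) is hypothetical. It
assumes the accounting point charges all IRQ time accrued since the previous
sample to whichever task is current at that moment, which is my reading of the
hunk above rather than something I have verified.

```c
/* Toy model, not kernel code: all names here are hypothetical. */
struct toy_rq {
	unsigned long long irq_time;     /* total IRQ time seen on this CPU */
	unsigned long long last_sampled; /* IRQ time at the last accounting point */
	int curr;                        /* index of the running task */
	unsigned long long charged[2];   /* IRQ time charged to each task */
};

/* Charge everything accrued since the last sample to the current task. */
static void toy_account(struct toy_rq *rq)
{
	unsigned long long delta = rq->irq_time - rq->last_sampled;

	rq->last_sampled = rq->irq_time;
	rq->charged[rq->curr] += delta;
}

/*
 * One tick period with a context switch in the middle.
 * If account_on_switch is non-zero, sample before switching tasks;
 * otherwise sample only when the tick fires.
 */
static void toy_run(struct toy_rq *rq, int account_on_switch)
{
	rq->curr = 0;
	rq->irq_time += 30;      /* 30 units of IRQ time while task 0 runs */
	if (account_on_switch)
		toy_account(rq); /* sample before switching away from task 0 */
	rq->curr = 1;            /* wakeup preemption: task 1 takes over */
	rq->irq_time += 10;      /* 10 units of IRQ time while task 1 runs */
	toy_account(rq);         /* the tick fires with task 1 current */
}
```

Running toy_run() on a zeroed toy_rq with account_on_switch=0 charges all 40
units to task 1 and none to task 0. With account_on_switch=1 the split is 30
for task 0 and 10 for task 1. In this model the total is preserved either way,
but the per-task attribution is wrong when only the tick samples.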

> rq_lock(rq, &rf);
>
> update_rq_clock(rq);