Re: [PATCH] RFC: sched: Rework task_sched_runtime to avoid calling update_rq_clock

From: John Stultz
Date: Sat Jun 15 2024 - 00:30:34 EST


On Fri, Jun 14, 2024 at 2:48 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Thu, Jun 13, 2024 at 12:51:42PM +0100, Qais Yousef wrote:
> > On 06/13/24 12:04, Peter Zijlstra wrote:
> > > @@ -5459,6 +5458,8 @@ void sched_tick(void)
> > >
> > > sched_clock_tick();
> > >
> > > + psi_account_irqtime(curr, &rq->psi_irq_time);
> > > +
> >
> > If wakeup preemption causes a context switch, wouldn't we lose this
> > information then? I *think* active migration might cause this information to be
> > lost too.
>
> I'm not sure what would be lost ?! the accounting is per cpu, not per
> task afaict. That said,...
>
> > pick_next_task() might be a better place to do the accounting?
>
> Additionally, when there has been an effective cgroup switch. Only on
> switch doesn't work for long running tasks, then the PSI information
> will be arbitrarily long out of date.
>
> Which then gets me something like the (completely untested) below..
>
> Hmm?

I applied and booted with this. It still takes the accounting out of
the hotpath for CLOCK_THREAD_CPUTIME_ID, and the microbenchmark
performance is back to 5.10 numbers.
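
For anyone wanting to poke at the same path, here is a rough sketch of
this kind of microbenchmark. It is not the exact test I ran: the helper
names (clock_ns, bench_thread_cputime) and the iteration count are made
up for illustration. It just times back-to-back
clock_gettime(CLOCK_THREAD_CPUTIME_ID) calls against CLOCK_MONOTONIC.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: read the given clock and return nanoseconds. */
static uint64_t clock_ns(clockid_t id)
{
	struct timespec ts;

	clock_gettime(id, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical benchmark loop: average wall-clock cost, in nanoseconds,
 * of one clock_gettime(CLOCK_THREAD_CPUTIME_ID) call over 'iters' calls.
 */
static double bench_thread_cputime(unsigned long iters)
{
	struct timespec ts;
	uint64_t start, end;
	unsigned long i;

	start = clock_ns(CLOCK_MONOTONIC);
	for (i = 0; i < iters; i++)
		clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts);
	end = clock_ns(CLOCK_MONOTONIC);

	return (double)(end - start) / (double)iters;
}
```

Calling bench_thread_cputime(1000000) from main() and printing the
result on both kernels should give a rough per-call comparison; absolute
numbers will vary with hardware and config.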

I don't have any correctness tests for irqtime measurements, so I'll
have to try to work something up for that next week.

thanks
-john