Re: [PATCH] RFC: sched: Rework task_sched_runtime to avoid calling update_rq_clock

From: John Stultz
Date: Wed Jun 12 2024 - 23:55:23 EST


On Wed, Jun 12, 2024 at 6:58 PM John Stultz <jstultz@xxxxxxxxxx> wrote:
>
> I recently got a bug report that
> clock_gettime(CLOCK_THREAD_CPUTIME_ID,...) had regressed between
> 5.10 and 6.1. Its not a huge regression in absolute time
> (~30-40ns), but is >10% change.
>
> I narrowed the cause down to the addition of
> psi_account_irqtime() in update_rq_clock_task(), in commit
> 52b1364ba0b1 ("sched/psi: Add PSI_IRQ to track IRQ/SOFTIRQ
> pressure")
>
> So that explains the behavior change, but it also seems odd that
> we're doing psi irq accounting from a syscall that is just
> trying to read the thread's cputime.
>
> Thinking about it more, it seems the re-use of update_rq_clock()
> to handle accounting for any in-progress time for the current
> task has the potential for side effects and unnecessary work.
>
> So instead rework the logic so we calculate the current cpu
> runtime in a read-only fashion.
>
> This has the side benefit of improving
> clock_gettime(CLOCK_THREAD_CPUTIME_ID,...) performance by ~12%
> over the behavior in 5.10, and ~21% over the 6.1 behavior.
>
> NOTE: I'm not 100% sure this is correct yet. There may be some
> edge cases I've overlooked, so I'd greatly appreciate any
> review or feedback.

Actually, I've definitely overlooked something as I'm seeing
inconsistencies in CLOCK_THREAD_CPUTIME_ID with testing.
I'll have to stare at this some more and resend once I've sorted it out.

Even so, I'd appreciate thoughts on the general approach of avoiding
update_rq_clock() in the clock_gettime() path.

thanks
-john