Re: [PATCH v7 0/6] sched_ext: Support high-performance monotonically non-decreasing clock

From: Tejun Heo
Date: Tue Jan 07 2025 - 14:46:53 EST


Hello,

On Fri, Jan 03, 2025 at 12:16:41PM -1000, Tejun Heo wrote:
> On Mon, Dec 30, 2024 at 06:56:19PM +0900, Changwoo Min wrote:
> > Many BPF schedulers (such as scx_central, scx_lavd, scx_rusty, scx_bpfland,
> > and scx_flash) frequently call bpf_ktime_get_ns() for tracking tasks' runtime
> > properties. If supported, bpf_ktime_get_ns() eventually reads a hardware
> > timestamp counter (TSC). However, reading a hardware TSC is slow on
> > some hardware platforms, degrading IPC.
> >
> > This patchset addresses the performance problem of reading hardware TSC
> > by leveraging the rq clock in the scheduler core, introducing a
> > scx_bpf_now() function for BPF schedulers. Whenever the rq clock
> > is fresh and valid, scx_bpf_now() provides the rq clock, which is
> > already updated by the scheduler core (update_rq_clock), so it can reduce
> > the number of hardware TSC reads.
> >
> > When the rq lock is released (rq_unpin_lock), the rq clock is invalidated,
> > so a subsequent scx_bpf_now() call gets the fresh sched_clock for the caller.
> >
> > In addition, scx_bpf_now() guarantees that the clock is monotonically
> > non-decreasing on each CPU, so the clock can never go backward
> > on the same CPU.
> >
> > Using scx_bpf_now() reduces the number of hardware TSC reads
> > by 50-80% (76% for scx_lavd, 82% for scx_bpfland, and 51% for scx_rusty)
> > for the following benchmark:
>
> The patch series generally looks good to me. Peter, if things look okay to
> you, I'll apply the series to sched_ext/for-6.14.
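
For anyone skimming the thread, the behaviour described in the cover letter
(serve the cached rq clock while it is valid, fall back to a fresh clock read
after the rq lock is released, never go backward on one CPU) can be modelled
roughly as below. This is only a user-space sketch and not the code in the
series: struct sketch_rq, sketch_hw_clock(), sketch_update_rq_clock(),
sketch_unpin() and sketch_now() are hypothetical names made up for
illustration, and the "hardware clock" is a plain variable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-CPU runqueue; NOT the kernel's struct rq. */
struct sketch_rq {
	uint64_t clock;       /* value cached by the last clock update */
	bool     clock_valid; /* true until the rq lock is released */
	uint64_t prev_clock;  /* last value handed out on this CPU */
};

static uint64_t fake_hw_time; /* simulated hardware timestamp */
static uint64_t hw_reads;     /* number of simulated expensive reads */

/* Simulated expensive clock read (stands in for a sched_clock/TSC read). */
static uint64_t sketch_hw_clock(void)
{
	hw_reads++;
	return fake_hw_time;
}

/* Models the scheduler core refreshing the rq clock (update_rq_clock). */
static void sketch_update_rq_clock(struct sketch_rq *rq)
{
	rq->clock = sketch_hw_clock();
	rq->clock_valid = true;
}

/* Models releasing the rq lock (rq_unpin_lock): the cache is invalidated. */
static void sketch_unpin(struct sketch_rq *rq)
{
	rq->clock_valid = false;
}

/* Models the idea behind scx_bpf_now() under the assumptions above. */
static uint64_t sketch_now(struct sketch_rq *rq)
{
	uint64_t now;

	assert(rq != NULL);

	/* Reuse the cached clock while valid; otherwise read a fresh one. */
	now = rq->clock_valid ? rq->clock : sketch_hw_clock();

	/* Clamp so the value never goes backward on this (simulated) CPU. */
	if (now < rq->prev_clock)
		now = rq->prev_clock;
	rq->prev_clock = now;

	return now;
}
```

In this model, repeated sketch_now() calls between sketch_update_rq_clock()
and sketch_unpin() cost no extra clock reads, which is where the reduction
reported in the cover letter would come from.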

Applying to sched_ext/for-6.14. Please holler if there are concerns.

Thanks.

--
tejun