Re: [PATCH v6 0/6] sched_ext: Support high-performance monotonically non-decreasing clock

From: Andrea Righi
Date: Fri Dec 20 2024 - 17:29:38 EST

In reply to: Changwoo Min: "[PATCH v6 5/6] sched_ext: Replace bpf_ktime_get_ns() to scx_bpf_now_ns()"
Next in thread: Changwoo Min: "Re: [PATCH v6 0/6] sched_ext: Support high-performance monotonically non-decreasing clock"

Hi Changwoo,

On Fri, Dec 20, 2024 at 03:20:19PM +0900, Changwoo Min wrote:
> Many BPF schedulers (such as scx_central, scx_lavd, scx_rusty, scx_bpfland,
> and scx_flash) frequently call bpf_ktime_get_ns() for tracking tasks' runtime
> properties. If supported, bpf_ktime_get_ns() eventually reads a hardware
> timestamp counter (TSC). However, reading a hardware TSC is not
> performant in some hardware platforms, degrading IPC.
>
> This patchset addresses the performance problem of reading hardware TSC
> by leveraging the rq clock in the scheduler core, introducing a
> scx_bpf_now_ns() function for BPF schedulers. Whenever the rq clock
> is fresh and valid, scx_bpf_now_ns() provides the rq clock, which is
> already updated by the scheduler core (update_rq_clock), so it can reduce
> reading the hardware TSC.
>
> When the rq lock is released (rq_unpin_lock), the rq clock is invalidated,
> so a subsequent scx_bpf_now_ns() call gets the fresh sched_clock for the caller.
>
> In addition, scx_bpf_now_ns() guarantees the clock is monotonically
> non-decreasing for the same CPU, so the clock cannot go backward
> in the same CPU.
>
> Using scx_bpf_now_ns() reduces the number of reading hardware TSC
> by 50-80% (76% for scx_lavd, 82% for scx_bpfland, and 51% for scx_rusty)
> for the following benchmark:
>
> perf bench -f simple sched messaging -t -g 20 -l 6000
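
As a rough illustration of the behavior described in the cover letter
above, here is a minimal userspace sketch. It is not the code from this
series: struct cpu_clock, clock_update(), clock_invalidate(), clock_now()
and the fake_hw_* helpers are made-up names, and a single struct stands
in for the per-CPU rq state. It only models the three properties quoted
above: return the cached clock while it is valid, fall back to a fresh
(expensive) read once it has been invalidated, and never go backward on
the same CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU state; field names are illustrative only. */
struct cpu_clock {
	uint64_t cached;	/* value stored by the last clock update */
	bool valid;		/* true while the cached value is fresh */
	uint64_t last;		/* last value handed out to a caller */
};

/* Fake "hardware" counter so the sketch can count expensive reads. */
static uint64_t fake_hw_value;
static unsigned int fake_hw_reads;

static uint64_t fake_hw_read(void)
{
	fake_hw_reads++;
	return fake_hw_value;
}

/* Stands in for update_rq_clock(): cache a timestamp, mark it fresh. */
static void clock_update(struct cpu_clock *c, uint64_t now)
{
	assert(c);
	c->cached = now;
	c->valid = true;
}

/* Stands in for the invalidation done when the rq lock is released. */
static void clock_invalidate(struct cpu_clock *c)
{
	assert(c);
	c->valid = false;
}

/*
 * Return the cached clock if it is still valid, otherwise read the
 * (expensive) source. Clamp against the last returned value so the
 * result never decreases for this CPU.
 */
static uint64_t clock_now(struct cpu_clock *c, uint64_t (*read_hw)(void))
{
	uint64_t now;

	assert(c && read_hw);
	now = c->valid ? c->cached : read_hw();
	if (now < c->last)
		now = c->last;
	c->last = now;
	return now;
}
```

In this model, every clock_now() call made while the cached value is
valid skips fake_hw_read() entirely, which is where the reduction in
hardware reads would come from.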

I've tested this patch set and haven't observed any significant
performance improvement (but also no regressions), although the systems
I tested on are likely already quite efficient at reading the hardware
TSC.

I'm curious whether we'd see a more significant difference on systems
without hardware-assisted virtualization (i.e., QEMU without KVM). Have
you already done any testing in such environments?

In any case:

Tested-by: Andrea Righi <arighi@xxxxxxxxxx>

-Andrea
