Re: [PATCH] sched/idle: Make idle poll dynamic per-cpu

From: Peter Zijlstra
Date: Mon Jan 16 2023 - 03:53:51 EST


On Thu, Jan 12, 2023 at 05:24:26PM +0100, Daniel Bristot de Oliveira wrote:
> idle=poll is frequently used on ultra-low-latency systems. Examples of
> such systems are high-performance trading and 5G NVRAM. The performance
> gain is given by avoiding the idle driver machinery and by keeping the
> CPU always in an active state - avoiding (odd) hardware heuristics that
> are out of the control of the OS.
>
> Currently, idle=poll is an all-or-nothing static option defined at
> boot time. There are two motivations for making this option dynamic
> and per-CPU:
>
> 1) Reduce the power usage/heat by allowing only selected CPUs to
> do idle polling;
> 2) Allow multi-tenant systems (e.g., Kubernetes) to enable idle
> polling only when ultra-low-latency applications are present
> on specific CPUs.
>
> Joe Mario did some experiments with this option enabled, and the results
> were significant. For example, by using dynamic idle polling on
> selected CPUs, cyclictest performance is optimal (like when using
> idle=poll), but cpu power consumption drops from 381 to 233 watts.
>
> Also, limiting idle=poll to the set of CPUs that benefit from
> it allows other CPUs to benefit from frequency boosts. Joe's results
> also show a round-trip improvement on the order of 80 ns when
> system-wide idle=poll is not used.
>
> The user can enable idle polling with this command:
> # echo 1 > /sys/devices/system/cpu/cpu{CPU_ID}/idle_poll
>
> And disable it via:
> # echo 0 > /sys/devices/system/cpu/cpu{CPU_ID}/idle_poll
>
> By default, all CPUs have idle polling disabled (the current behavior).
> A static key avoids the CPU mask check overhead when no idle polling
> is enabled.

Urgh, can we please make this a cpuidle governor thing or so? So that we
don't need to invent new interfaces and such.
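
For reference, the per-CPU knob described in the quoted patch could be
driven for a set of CPUs roughly as below. This is only a minimal sketch:
it assumes the idle_poll file exists exactly as the patch description
states (it is proposed there, not an established interface), and the names
set_idle_poll and SYSFS_CPU_ROOT are hypothetical, introduced here for
illustration.

```shell
#!/bin/sh
# Sketch only: idle_poll is the file proposed in the quoted patch.
# SYSFS_CPU_ROOT and set_idle_poll are hypothetical names for illustration.
SYSFS_CPU_ROOT="${SYSFS_CPU_ROOT:-/sys/devices/system/cpu}"

# set_idle_poll VALUE CPU...
# Writes VALUE (0 or 1) to the idle_poll file of every listed CPU.
# Returns 0 if all writes succeeded, 1 if any failed, 2 on a bad VALUE.
set_idle_poll() {
    value="$1"
    shift
    case "$value" in
        0|1) ;;
        *) echo "value must be 0 or 1" >&2; return 2 ;;
    esac
    rc=0
    for cpu in "$@"; do
        knob="$SYSFS_CPU_ROOT/cpu$cpu/idle_poll"
        if [ -w "$knob" ]; then
            echo "$value" > "$knob" || rc=1
        else
            # The file is absent or read-only, e.g. the patch is not applied.
            echo "cpu$cpu: $knob is not writable" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

With that function sourced, "set_idle_poll 1 2 3" would enable polling on
CPUs 2 and 3, and "set_idle_poll 0 2 3" would turn it off again.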