Re: [RFC PATCH v2 0/8] Introduce cpu idle prediction functionality

From: Rafael J. Wysocki
Date: Fri Oct 13 2017 - 21:24:13 EST


On Saturday, September 30, 2017 9:20:26 AM CEST Aubrey Li wrote:
> We found that under some latency-intensive workloads, short idle periods
> occur very commonly, and the idle entry and exit paths then start to
> dominate, so it's important to optimize them. To detect the short idle
> pattern, we need to figure out how long the coming idle period will be and
> what the threshold for a short idle interval is.
>
> A cpu idle prediction functionality is introduced in this proposal to catch
> the short idle pattern.
>
> Firstly, we check the IRQ timings subsystem to see whether an event is
> coming soon.
> -- https://lwn.net/Articles/691297/
>
> Secondly, we check the scheduler's idle statistics to see whether we are
> likely to go into a short idle.
> -- https://patchwork.kernel.org/patch/2839221/
>
> Thirdly, we predict the next idle interval by using the prediction
> functionality in the idle governor, if it has any.
>
> For the threshold of the short idle interval, we record timestamps at idle
> entry to measure its overhead, and multiply that overhead by a tunable
> parameter exposed here:
> -- /proc/sys/kernel/fast_idle_ratio
>
> In this proposal, we use the output of the idle prediction to skip turning
> the tick off if a short idle is predicted. Reprogramming the hardware timer
> twice (off and on) is expensive for a very short idle. Other potential
> optimizations could be driven by the same indicator.
>
> I observed that when the system is idle, the idle predictor reports 20 long
> idles per second and zero fast idles on one CPU. When the workload is
> running, the idle predictor reports 72899 fast idles per second and zero
> long idles on the same CPU.
>
> Aubrey Li (8):
>   cpuidle: menu: extract prediction functionality
>   cpuidle: record the overhead of idle entry
>   cpuidle: add a new predict interface
>   tick/nohz: keep tick on for a fast idle
>   timers: keep sleep length updated as needed
>   cpuidle: make fast idle threshold tunable
>   cpuidle: introduce irq timing to make idle prediction
>   cpuidle: introduce run queue average idle to make idle prediction
>
>  drivers/cpuidle/Kconfig          |   1 +
>  drivers/cpuidle/cpuidle.c        | 109 +++++++++++++++++++++++++++++++++++++++
>  drivers/cpuidle/governors/menu.c |  69 ++++++++++++++++---------
>  include/linux/cpuidle.h          |  21 ++++++++
>  kernel/sched/idle.c              |  14 ++++-
>  kernel/sysctl.c                  |  12 +++++
>  kernel/time/tick-sched.c         |   7 +++
>  7 files changed, 209 insertions(+), 24 deletions(-)
>
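
To make sure the quoted description is read correctly, here is a rough
user-space model of the three checks and the threshold. It is only a sketch:
the struct, its field names and both functions are hypothetical and do not
come from the patches, and it assumes that every source is compared against
the same threshold and that any single source reporting a value below it is
enough to call the idle "fast".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical inputs for illustration; not taken from the patch series. */
struct idle_inputs {
	uint64_t next_irq_us;        /* IRQ timings estimate, UINT64_MAX if unknown */
	uint64_t sched_avg_idle_us;  /* scheduler's average idle estimate */
	uint64_t governor_pred_us;   /* governor prediction, UINT64_MAX if none */
	uint64_t entry_overhead_us;  /* measured idle entry overhead */
	unsigned int fast_idle_ratio;/* tunable multiplier (fast_idle_ratio) */
};

/* Threshold = measured idle entry overhead times the tunable ratio. */
static uint64_t fast_idle_threshold_us(const struct idle_inputs *in)
{
	return in->entry_overhead_us * in->fast_idle_ratio;
}

/* Assumption: the sources are checked in order and any one may say "fast". */
static bool predict_fast_idle(const struct idle_inputs *in)
{
	uint64_t threshold = fast_idle_threshold_us(in);

	if (in->next_irq_us < threshold)        /* 1. an event is coming soon */
		return true;
	if (in->sched_avg_idle_us < threshold)  /* 2. scheduler idle statistics */
		return true;
	if (in->governor_pred_us < threshold)   /* 3. governor prediction */
		return true;
	return false;
}
```

With an entry overhead of 10 us and a ratio of 10, for instance, this model
treats anything predicted to be shorter than 100 us as a fast idle.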

Overall, it looks like you could avoid stopping the tick every time the
predicted idle duration is not longer than the tick interval in the first
place.

Why don't you do that?
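
Something along these lines is what I mean, as a minimal user-space sketch.
The function names and the hz parameter are made up for illustration and are
not existing kernel interfaces; the only point is that the comparison is
against the tick interval rather than against a separately tuned threshold.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: length of one tick in microseconds for a given HZ. */
static uint64_t tick_interval_us(unsigned int hz)
{
	return 1000000ULL / hz;
}

/*
 * Stop the tick only if the predicted idle duration is longer than one
 * tick interval. For anything shorter, at most one tick can expire while
 * idle, so the timer is left alone.
 */
static bool should_stop_tick(uint64_t predicted_idle_us, unsigned int hz)
{
	return predicted_idle_us > tick_interval_us(hz);
}
```

With HZ=250 the tick interval is 4000 us, so a predicted idle of 4000 us or
less would keep the tick running in this model.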