Re: [RFC PATCH v2 0/1] cpuidle: teo: Introduce optional util-awareness

From: Rafael J. Wysocki
Date: Wed Oct 12 2022 - 14:51:03 EST

On Mon, Oct 3, 2022 at 4:50 PM Kajetan Puchalski
<kajetan.puchalski@xxxxxxx> wrote:
> Hi,
> At the moment, all the available idle governors operate mainly based on their own past performance

Not true, at least for the menu and teo governors that use the
information on the distribution of CPU wakeups that is available to
them and try to predict the next idle duration with the help of it.
This has little to do with their performance.

> without taking into account any scheduling information. Especially on interactive systems, this
> results in them frequently selecting a deeper idle state and then waking up before its target
> residency is hit, thus leading to increased wakeup latency and lower performance with no power
> saving. For 'menu' while web browsing on Android for instance, those types of wakeups ('too deep')
> account for over 24% of all wakeups.

How is this measured?

> At the same time, on some platforms C0 can be power efficient enough to warrant wanting to prefer
> it over C1.

Well, energy-efficiency is relative, so strictly speaking it is
invalid to say "power efficient enough".

Also, as far as idle CPUs are concerned, we are talking about the
situation in which no useful work is done at all, so the state drawing
less power is always more energy-efficient than the one drawing more
power.

You may argue that predicting idle durations that are too long too
often leads to both excessive task wakeup latency and excessive energy
usage at the same time, but this may very well mean that the target
residency value for C1 is too low.
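
To make the target residency point concrete, here is a minimal sketch
of the break-even arithmetic. All of the numbers and names in it
(P_SHALLOW_MW, P_DEEP_MW, E_TRANSITION_UJ, ADVERTISED_RESIDENCY_US) are
made up for illustration and do not describe any real platform. It
assumes the simplest possible model: constant power in each state plus
a fixed energy cost for entering and exiting the deeper one.

```python
# Hypothetical numbers, chosen only to illustrate the argument.
P_SHALLOW_MW = 50.0      # idle power in the shallow state (C0)
P_DEEP_MW = 10.0         # idle power in the deeper state (C1)
E_TRANSITION_UJ = 80.0   # extra energy spent entering and exiting C1
ADVERTISED_RESIDENCY_US = 1000.0  # hypothetical C1 target residency


def idle_energy_uj(power_mw, duration_us, overhead_uj=0.0):
    """Energy used while idle, in microjoules.

    mW * us gives nanojoules, hence the division by 1000.
    """
    return power_mw * duration_us / 1000.0 + overhead_uj


def break_even_us():
    """Idle duration at which C1 starts to use less energy than C0."""
    return E_TRANSITION_UJ * 1000.0 / (P_SHALLOW_MW - P_DEEP_MW)


def deep_state_wastes_energy(duration_us):
    """True if choosing C1 for this idle period costs more than C0."""
    deep = idle_energy_uj(P_DEEP_MW, duration_us, E_TRANSITION_UJ)
    shallow = idle_energy_uj(P_SHALLOW_MW, duration_us)
    return deep > shallow


if __name__ == "__main__":
    print("break-even:", break_even_us(), "us")
    # An idle period that meets the advertised residency but is still
    # shorter than the real break-even point.
    print("1500 us sleep wastes energy in C1:",
          deep_state_wastes_energy(1500.0))
```

With these made-up values the break-even point is above the advertised
target residency, so an idle period that formally "hits" the target
residency can still cost more energy in C1 than in C0. In that case the
thing to fix is the target residency value, not the governor.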

> Sleeps that happened in C0 while they could have used C1 ('too shallow') only save
> less power than they otherwise could have. Too deep sleeps, on the other hand, harm performance
> and nullify the potential power saving from using C1 in the first place. While taking this into
> account, it is clear that on balance it is preferable for an idle governor to have more too shallow
> sleeps instead of more too deep sleeps on those kinds of platforms.


> Currently the best available governor under this metric is TEO which on average results in less than
> half the percentage of too deep sleeps compared to 'menu', getting much better wakeup latencies and
> increased performance in the process.

Well, good to hear that, but some numbers in support of that claim
would be nice to have too.

> This proposed optional extension to TEO would specifically tune it for minimising too deep
> sleeps and minimising latency to achieve better performance. To this end, before selecting the next
> idle state it uses the avg_util signal of a CPU's runqueue in order to determine to what extent the
> CPU is being utilized.

Which has no bearing on what the CPU idle time governors have to do,
which is (1) to predict the next idle duration as precisely as
reasonably possible and (2) to minimise the cost in terms of task
wakeup latencies associated with using deep idle states.

The avg_util value tells us nothing about how much the CPU is going to
be idle this time and it also tells us nothing about the
latency-sensitivity of the workload.

Yes, it tells us how much idle time there was on the given CPU in the
past, on average, but there is zero information about the
distribution of that idle time in it.
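
A small sketch of what I mean, under loud assumptions: utilization is
computed here as a plain busy/total ratio (the real signal is a
decaying average, which this does not model), and the traces, the
helper names and the TARGET_RESIDENCY_US value are all hypothetical.

```python
# Each trace is a list of (busy_us, idle_us) pairs. Both traces below
# are made up for illustration.
FINE_GRAINED = [(1_000, 1_000)] * 100   # 1 ms busy, 1 ms idle, repeated
COARSE = [(100_000, 100_000)]           # 100 ms busy, 100 ms idle

TARGET_RESIDENCY_US = 10_000            # hypothetical deep-state residency


def utilization(trace):
    """Plain average utilization: busy time over total time."""
    busy = sum(b for b, _ in trace)
    total = sum(b + i for b, i in trace)
    return busy / total


def fraction_meeting_residency(trace, target_us):
    """Share of idle periods long enough to justify the deep state."""
    idle_periods = [i for _, i in trace]
    long_enough = sum(1 for i in idle_periods if i >= target_us)
    return long_enough / len(idle_periods)


if __name__ == "__main__":
    for name, trace in (("fine-grained", FINE_GRAINED), ("coarse", COARSE)):
        print(name,
              "util:", utilization(trace),
              "idle periods >= target:",
              fraction_meeting_residency(trace, TARGET_RESIDENCY_US))
```

Both traces have the same average utilization, yet in the first one no
idle period is long enough for the deep state and in the second one
every idle period is. The average alone cannot tell them apart.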

So in the first place please tell me why it fundamentally makes sense
to use avg_util in CPU idle time management at all.