Re: [PATCH v1] cpuidle: teo: Update documentation after previous changes

From: Rafael J. Wysocki
Date: Fri Jan 10 2025 - 09:53:13 EST


On Fri, Jan 10, 2025 at 3:09 PM Christian Loehle
<christian.loehle@xxxxxxx> wrote:
>
> On 1/10/25 12:48, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> >
> > After previous changes, the description of the teo governor in the
> > documentation comment does not match the code any more, so update it
> > as appropriate.
> >
> > Fixes: 449914398083 ("cpuidle: teo: Remove recent intercepts metric")
> > Fixes: 2662342079f5 ("cpuidle: teo: Gather statistics regarding whether or not to stop the tick")
> > Fixes: 6da8f9ba5a87 ("cpuidle: teo: Skip tick_nohz_get_sleep_length() call in some cases")
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> > ---
> > drivers/cpuidle/governors/teo.c | 99 +++++++++++++++++++++-------------------
> > 1 file changed, 52 insertions(+), 47 deletions(-)
> >
> > --- a/drivers/cpuidle/governors/teo.c
> > +++ b/drivers/cpuidle/governors/teo.c
> > @@ -10,25 +10,27 @@
> > * DOC: teo-description
> > *
> > * The idea of this governor is based on the observation that on many systems
> > - * timer events are two or more orders of magnitude more frequent than any
> > - * other interrupts, so they are likely to be the most significant cause of CPU
> > - * wakeups from idle states. Moreover, information about what happened in the
> > - * (relatively recent) past can be used to estimate whether or not the deepest
> > - * idle state with target residency within the (known) time till the closest
> > - * timer event, referred to as the sleep length, is likely to be suitable for
> > - * the upcoming CPU idle period and, if not, then which of the shallower idle
> > - * states to choose instead of it.
> > - *
> > - * Of course, non-timer wakeup sources are more important in some use cases
> > - * which can be covered by taking a few most recent idle time intervals of the
> > - * CPU into account. However, even in that context it is not necessary to
> > - * consider idle duration values greater than the sleep length, because the
> > - * closest timer will ultimately wake up the CPU anyway unless it is woken up
> > - * earlier.
> > - *
> > - * Thus this governor estimates whether or not the prospective idle duration of
> > - * a CPU is likely to be significantly shorter than the sleep length and selects
> > - * an idle state for it accordingly.
> > + * timer interrupts are two or more orders of magnitude more frequent than any
> > + * other interrupt types, so they are likely to dominate CPU wakeup patterns.
> > + * Moreover, in principle, the time when the next timer event is going to occur
> > + * can be determined at the idle state selection time, although doing that may
> > + * be costly, so it can be regarded as the most reliable source of information
> > + * for idle state selection.
> > + *
> > + * Of course, non-timer wakeup sources are more important in some use cases,
> > + * but even then it is generally unnecessary to consider idle duration values
> > + * greater than the time till the next timer event, referred to as the sleep
> > + * length in what follows, because the closest timer will ultimately wake up the
> > + * CPU anyway unless it is woken up earlier.
> > + *
> > + * However, since obtaining the sleep length may be costly, the governor first
> > + * checks if it can select a shallow idle state using wakeup pattern information
> > + * from recent times, in which case it can do without knowing the sleep length
> > + * at all. For this purpose, it counts CPU wakeup events and looks for an idle
> > + * state whose terget residency has not exceeded the idle duration (measured
>
> s/terget/target
>
> > + * after wakeup) in the majority of relevant recent cases. If the target
> > + * residency of that state is small enough, it may be used right away and the
> > + * sleep length need not be determined.
> > *
> > * The computations carried out by this governor are based on using bins whose
> > * boundaries are aligned with the target residency parameter values of the CPU
> > @@ -39,7 +41,11 @@
> > * idle state 2, the third bin spans from the target residency of idle state 2
> > * up to, but not including, the target residency of idle state 3 and so on.
> > * The last bin spans from the target residency of the deepest idle state
> > - * supplied by the driver to infinity.
> > + * supplied by the driver to the scheduler tick period length or to infinity if
> > + * the tick period length is less than the targer residency of that state. In
>
> s/targer/target

Will fix it when applying.

> > + * the latter case, the governor also counts events with the measured idle
> > + * duration between the tick period length and the target residency of the
> > + * deepest idle state.
> > *
> > * Two metrics called "hits" and "intercepts" are associated with each bin.
> > * They are updated every time before selecting an idle state for the given CPU
> > @@ -49,47 +55,46 @@
> > * sleep length and the idle duration measured after CPU wakeup fall into the
> > * same bin (that is, the CPU appears to wake up "on time" relative to the sleep
> > * length). In turn, the "intercepts" metric reflects the relative frequency of
> > - * situations in which the measured idle duration is so much shorter than the
> > - * sleep length that the bin it falls into corresponds to an idle state
> > - * shallower than the one whose bin is fallen into by the sleep length (these
> > - * situations are referred to as "intercepts" below).
> > + * non-timer wakeup events for which the measured idle duration falls into a bin
> > + * that corresponds to an idle state shallower than the one whose bin is fallen
> > + * into by the sleep length (these events are also referred to as "intercepts"
> > + * below).
> > *
> > * In order to select an idle state for a CPU, the governor takes the following
> > * steps (modulo the possible latency constraint that must be taken into account
> > * too):
> > *
> > - * 1. Find the deepest CPU idle state whose target residency does not exceed
> > - * the current sleep length (the candidate idle state) and compute 2 sums as
> > - * follows:
> > - *
> > - * - The sum of the "hits" and "intercepts" metrics for the candidate state
> > - * and all of the deeper idle states (it represents the cases in which the
> > - * CPU was idle long enough to avoid being intercepted if the sleep length
> > - * had been equal to the current one).
> > - *
> > - * - The sum of the "intercepts" metrics for all of the idle states shallower
> > - * than the candidate one (it represents the cases in which the CPU was not
> > - * idle long enough to avoid being intercepted if the sleep length had been
> > - * equal to the current one).
> > + * 1. Find the deepest enabled CPU idle state (the candidate idle state) and
> > + * compute 2 sums as follows:
> > *
> > - * 2. If the second sum is greater than the first one the CPU is likely to wake
> > - * up early, so look for an alternative idle state to select.
> > + * - The sum of the "hits" metric for all of the idle states shallower than
> > + * the candidate one (it represents the cases in which the CPU was likely
> > + * woken up by a timer).
> > + *
> > + * - The sum of the "intercepts" metric for all of the idle states shallower
> > + * than the candidate one (it represents the cases in which the CPU was
> > + * likely woken up by a non-timer wakeup source).
> > + *
> > + * 2. If the second sum computed in step 1 is greater than a half of the sum of
> > + * both mertics for the candidate state bin and all subsequent bins (if any),
>
> s/mertics/metrics

Ditto.

> > + * a shallower idle state is likely to be more suitable, so look for it.
> > *
> > - * - Traverse the idle states shallower than the candidate one in the
> > + * - Traverse the enabled idle states shallower than the candidate one in the
> > * descending order.
> > *
> > * - For each of them compute the sum of the "intercepts" metrics over all
> > * of the idle states between it and the candidate one (including the
> > * former and excluding the latter).
> > *
> > - * - If each of these sums that needs to be taken into account (because the
> > - * check related to it has indicated that the CPU is likely to wake up
> > - * early) is greater than a half of the corresponding sum computed in step
> > - * 1 (which means that the target residency of the state in question had
> > - * not exceeded the idle duration in over a half of the relevant cases),
> > - * select the given idle state instead of the candidate one.
> > + * - If this sum is greater than a half of the second sum computed in step 1,
> > + * use the given idle state as the new candidate one.
> > *
> > - * 3. By default, select the candidate state.
> > + * 3. If the current candidate state is state 0 or its target residency is short
> > + * enough, return it and prevent the scheduler tick from being stopped.
> > + *
> > + * 4. Obtain the sleep length value and check if it is below the target
> > + * residency of the current candidate state, in which case a new shallower
> > + * candidate state needs to be found, so look for it.
> > */
>
> Description seems to parse in my brain FWIW.
> Thanks for cleaning that up, clearly I've overlooked that doc.
>
> Reviewed-by: Christian Loehle <christian.loehle@xxxxxxx>

Thanks!