Re: [PATCH v2] cpuidle: Add 'above' and 'below' idle state metrics

From: Daniel Lezcano
Date: Thu Jan 10 2019 - 04:53:05 EST


On 10/12/2018 12:30, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
>
> Add two new metrics for CPU idle states, "above" and "below", to count
> the number of times the given state had been asked for (or entered
> from the kernel's perspective), but the observed idle duration turned
> out to be too short or too long for it (respectively).
>
> These metrics help to estimate the quality of the CPU idle governor
> in use.
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> ---
>
> -> v2: Fix a leftover in the documentation from the previous versions
> of the patch and a typo in the changelog.
>
> ---
> Documentation/ABI/testing/sysfs-devices-system-cpu | 7 ++++
> Documentation/admin-guide/pm/cpuidle.rst | 10 ++++++
> drivers/cpuidle/cpuidle.c | 31 ++++++++++++++++++++-
> drivers/cpuidle/sysfs.c | 6 ++++
> include/linux/cpuidle.h | 2 +
> 5 files changed, 55 insertions(+), 1 deletion(-)
>
> Index: linux-pm/drivers/cpuidle/cpuidle.c
> ===================================================================
> --- linux-pm.orig/drivers/cpuidle/cpuidle.c
> +++ linux-pm/drivers/cpuidle/cpuidle.c
> @@ -202,7 +202,6 @@ int cpuidle_enter_state(struct cpuidle_d
> struct cpuidle_state *target_state = &drv->states[index];
> bool broadcast = !!(target_state->flags & CPUIDLE_FLAG_TIMER_STOP);
> ktime_t time_start, time_end;
> - s64 diff;
>
> /*
> * Tell the time framework to switch to a broadcast timer because our
> @@ -248,6 +247,9 @@ int cpuidle_enter_state(struct cpuidle_d
> local_irq_enable();
>
> if (entered_state >= 0) {
> + s64 diff, delay = drv->states[entered_state].exit_latency;
> + int i;
> +
> /*
> * Update cpuidle counters
> * This can be moved to within driver enter routine,
> @@ -260,6 +262,33 @@ int cpuidle_enter_state(struct cpuidle_d
> dev->last_residency = (int)diff;

Shouldn't we subtract 'delay' from the computed 'diff' in any case?

Otherwise 'last_residency' accounts for both the effective sleep time
and the time to wake up. Aren't we interested only in the sleep time,
for prediction as well as for the metrics?
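
To make the question concrete, here is a minimal user-space sketch of the
adjustment I have in mind. The helper name sleep_time_us() is hypothetical
and not part of the patch, and it assumes both values are in microseconds.
Since exit_latency is a worst-case figure, the result is only an estimate of
the real sleep time.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: estimate the time actually
 * spent asleep by removing the state's exit latency from the measured
 * enter-to-exit interval. Both arguments are assumed to be microseconds.
 */
static int64_t sleep_time_us(int64_t measured_us, int64_t exit_latency_us)
{
	/* A very short idle period can be shorter than the exit latency. */
	if (measured_us <= exit_latency_us)
		return 0;

	return measured_us - exit_latency_us;
}
```

With something like this, 'last_residency' would be set from the corrected
value, and the "below" check could compare it directly against the deeper
state's target_residency.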

> dev->states_usage[entered_state].time += dev->last_residency;
> dev->states_usage[entered_state].usage++;
> +
> + if (diff < drv->states[entered_state].target_residency) {
> + for (i = entered_state - 1; i >= 0; i--) {
> + if (drv->states[i].disabled ||
> + dev->states_usage[i].disable)
> + continue;
> +
> + /* Shallower states are enabled, so update. */
> + dev->states_usage[entered_state].above++;
> + break;
> + }
> + } else if (diff > delay) {
> + for (i = entered_state + 1; i < drv->state_count; i++) {
> + if (drv->states[i].disabled ||
> + dev->states_usage[i].disable)
> + continue;
> +
> + /*
> + * Update if a deeper state would have been a
> + * better match for the observed idle duration.
> + */
> + if (diff - delay >= drv->states[i].target_residency)
> + dev->states_usage[entered_state].below++;
> +
> + break;
> + }
> + }
> } else {
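
For anyone following along, the classification done by the hunk above can
be modelled in user space roughly as below. This is only a sketch of my
reading of the patch: struct idle_state_model, enum idle_verdict and
classify_idle() are made-up names, and the two separate disable flags from
the patch are folded into a single 'disabled' field.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the per-state data the hunk consults. */
struct idle_state_model {
	int64_t target_residency_us;
	int64_t exit_latency_us;
	bool disabled;	/* merges the driver and per-device disable flags */
};

enum idle_verdict { IDLE_OK, IDLE_ABOVE, IDLE_BELOW };

/* Classify one idle period of diff_us spent in state 'entered'. */
static enum idle_verdict classify_idle(const struct idle_state_model *states,
				       int count, int entered, int64_t diff_us)
{
	int64_t delay = states[entered].exit_latency_us;
	int i;

	if (diff_us < states[entered].target_residency_us) {
		/* Too short: counts only if an enabled shallower state exists. */
		for (i = entered - 1; i >= 0; i--) {
			if (!states[i].disabled)
				return IDLE_ABOVE;
		}
	} else if (diff_us > delay) {
		/* Only the nearest enabled deeper state is considered. */
		for (i = entered + 1; i < count; i++) {
			if (states[i].disabled)
				continue;
			if (diff_us - delay >= states[i].target_residency_us)
				return IDLE_BELOW;
			break;
		}
	}

	return IDLE_OK;
}
```

Note that the "above" branch compares the raw 'diff' while the "below"
branch compares 'diff - delay', which is what prompted the question above.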


--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog