Re: [RFC PATCH v2 2/8] cpuidle: record the overhead of idle entry

From: Li, Aubrey
Date: Sun Oct 15 2017 - 23:12:07 EST


On 2017/10/14 8:35, Rafael J. Wysocki wrote:
> On Saturday, September 30, 2017 9:20:28 AM CEST Aubrey Li wrote:
>> Record the overhead of idle entry in micro-second
>>
>
> What is this needed for?

We need to figure out how long of a idle is a short idle and recording
the overhead is for this purpose. The short idle threshold is based
on this overhead.

>
>> +void cpuidle_entry_end(void)
>> +{
>> + struct cpuidle_device *dev = cpuidle_get_device();
>> + u64 overhead;
>> + s64 diff;
>> +
>> + if (dev) {
>> + dev->idle_stat.entry_end = local_clock();
>> + overhead = div_u64(dev->idle_stat.entry_end -
>> + dev->idle_stat.entry_start, NSEC_PER_USEC);
>
> Is the conversion really necessary?
>
> If so, then why?

We can choose nano-second and micro-second. Given that workload results
in the short idle pattern, I think micro-second is good enough for the
real workload.

Another reason is that prediction from idle governor is micro-second, so
I convert it for comparing purpose.
>
> And if there is a good reason, what about using right shift to do
> an approximate conversion to avoid the extra division here?

Sure >> 10 works for me as I don't think here precision is a big deal.

>
>> + diff = overhead - dev->idle_stat.overhead;
>> + dev->idle_stat.overhead += diff >> 3;
>
> Can you please explain what happens in the two lines above?

Online average computing algorithm, stolen from update_avg() @ kernel/sched/core.c.
>
>> + /*
>> + * limit overhead to 1us
>> + */
>> + if (dev->idle_stat.overhead == 0)
>> + dev->idle_stat.overhead = 1;
>> + }
>> +}
>> +
>> /**
>> * cpuidle_install_idle_handler - installs the cpuidle idle loop handler
>> */
>> diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
>> index fc1e5d7..cad9b71 100644
>> --- a/include/linux/cpuidle.h
>> +++ b/include/linux/cpuidle.h
>> @@ -72,6 +72,15 @@ struct cpuidle_device_kobj;
>> struct cpuidle_state_kobj;
>> struct cpuidle_driver_kobj;
>>
>> +struct cpuidle_stat {
>> + u64 entry_start; /* nanosecond */
>> + u64 entry_end; /* nanosecond */
>> + u64 overhead; /* nanosecond */
>> + unsigned int predicted_us; /* microsecond */
>> + bool predicted; /* ever predicted? */
>> + bool fast_idle; /* fast idle? */
>
> Some of the fields here are never used in the code after this patch.
>
> Also it would be good to add a comment describing the meaning of the
> fields.
>
okay, will add in the next version.

Thanks,
-Aubrey