Re: [RFC PATCH 3/3] idle: store the idle state index in the struct rq

From: Preeti U Murthy
Date: Mon Feb 03 2014 - 01:37:17 EST


Hi Daniel,

On 01/31/2014 03:45 PM, Daniel Lezcano wrote:
> On 01/31/2014 09:45 AM, Preeti Murthy wrote:
>> Hi,
>>
>> On Thu, Jan 30, 2014 at 10:55 PM, Daniel Lezcano
>> <daniel.lezcano@xxxxxxxxxx> wrote:
>>> On 01/30/2014 05:35 PM, Peter Zijlstra wrote:
>>>>
>>>> On Thu, Jan 30, 2014 at 05:27:54PM +0100, Daniel Lezcano wrote:
>>>>>
>>>>> struct cpuidle_state *state = &drv->states[rq->index];
>>>>>
>>>>> And from the state, we have the following information:
>>>>>
>>>>> struct cpuidle_state {
>>>>>
>>>>> [ ... ]
>>>>>
>>>>> unsigned int exit_latency; /* in US */
>>>>> int power_usage; /* in mW */
>>>>> unsigned int target_residency; /* in US */
>>>>> bool disabled; /* disabled on all CPUs */
>>>>>
>>>>> [ ... ]
>>>>> };
>>>>
>>>>
>>>> Right, but can we say that a higher index will save more power and have
>>>> a higher exit latency? Or is a driver free to have a random mapping from
>>>> idle_index to state?
>>>
>>>
>>> If the driver does its own random mapping, that will break the governor
>>> logic. So yes, the states are ordered: the higher the index is, the more
>>> power you save and the higher the exit latency is.
>>
>> The above point holds true only for the ladder governor, which sees the
>> idle states indexed in increasing order of target_residency/exit_latency.
>
> The cpuidle framework has been modified for both governors, see commit
> 8aef33a7.
>
> The power field was initially used to do the selection, but no power
> value was ever used to fill this field by any hardware. So the field
> was arbitrarily filled with a decreasing value (-1, -2, -3 ...), and
> used by the governor's select function. The patch above just removed
> this field and the condition on power for 'select', assuming the idle
> states are power ordered in the array.

OK. Looking at commit id 71abbbf856a0, it looks like the primary
motivation for it was the power_usage numbers of each idle state. But if
those went unused, then it perhaps makes sense to revert that patch.

Commit 8aef33a7 pretty much did that. However, I think it overlooked the
menu_select() function, where the search still iterates through all the
idle states, a behaviour introduced by the above-mentioned commit. Since
its purpose is outdated as per what you say, it's best if we correct this
now, as per the post below that you have pointed to.

[RFC PATCH] cpuidle: reduce unnecessary loop in c-state selection
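
To make the point concrete, here is a minimal sketch of what the search
could look like once the table is known to be ordered. It is not the
actual menu_select() code and not Alex's patch: struct idle_state and
select_ordered() are hypothetical, simplified stand-ins, and the disabled
flag and the menu governor's other checks are left out on purpose.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct cpuidle_state. */
struct idle_state {
	unsigned int exit_latency;	/* in us */
	unsigned int target_residency;	/* in us */
};

/*
 * Return the index of the deepest state that fits both the predicted idle
 * time and the latency requirement. This assumes the table is sorted by
 * increasing target_residency and exit_latency, so the first state that
 * does not fit ends the search instead of being skipped.
 */
static int select_ordered(const struct idle_state *states, size_t count,
			  unsigned int predicted_us,
			  unsigned int latency_req_us)
{
	int chosen = 0;	/* state 0 is the shallowest fallback */
	size_t i;

	assert(count > 0);

	for (i = 1; i < count; i++) {
		if (states[i].target_residency > predicted_us ||
		    states[i].exit_latency > latency_req_us)
			break;	/* deeper states cannot fit either */
		chosen = (int)i;
	}

	return chosen;
}
```

With an unordered table the break would have to be a continue, which is
the full scan discussed above.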
>
>> However, this is not true as far as I can see in the menu governor. It
>> acknowledges the dynamic ordering of idle states, as can be seen in the
>> menu_select() function in the menu governor, where the idle state for the
>> CPU gets chosen. You will notice that, even if it is found that the
>> predicted idle time of the CPU is smaller than the target residency of an
>> idle state, the governor continues to search for suitable idle states in
>> the higher indexed states, although it should have halted if the idle
>> states were ordered according to their target residency. The same holds
>> for exit_latency.
>
> I am not sure I get the point. Actually, this loop should just be
> optimized to search the idle states backward, like cpuidle_play_dead does.
>
> There is also a patch proposed by Alex Shi about this loop.
>
> [RFC PATCH] cpuidle: reduce unnecessary loop in c-state selection
>
> http://comments.gmane.org/gmane.linux.power-management.general/42124

But again, if we copy the exit_latency and target_residency numbers of
the idle state entered into the rq as soon as the idle state for the CPU
is chosen, as per the discussion on this thread, then I guess the
ordering of the idle states in the cpuidle state table does not matter.
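
A rough sketch of what I mean, under stated assumptions: struct
rq_idle_info, rq_note_idle_state() and cheapest_wakeup_cpu() are
hypothetical names and are not part of the posted patch. The scheduler
side compares only the cached values, so it never needs to interpret the
state index.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct cpuidle_state. */
struct idle_state {
	unsigned int exit_latency;	/* in us */
	unsigned int target_residency;	/* in us */
};

/* Hypothetical per-runqueue cache of the state the CPU has entered. */
struct rq_idle_info {
	unsigned int exit_latency;	/* in us */
	unsigned int target_residency;	/* in us */
};

/* Copy the chosen state's numbers into the rq when the state is picked. */
static void rq_note_idle_state(struct rq_idle_info *info,
			       const struct idle_state *state)
{
	info->exit_latency = state->exit_latency;
	info->target_residency = state->target_residency;
}

/*
 * Return the CPU that is cheapest to wake up, judged purely by the cached
 * exit latency. No assumption is made about how the driver orders its
 * state table.
 */
static int cheapest_wakeup_cpu(const struct rq_idle_info *infos,
			       size_t ncpus)
{
	size_t cpu, best = 0;

	assert(ncpus > 0);

	for (cpu = 1; cpu < ncpus; cpu++)
		if (infos[cpu].exit_latency < infos[best].exit_latency)
			best = cpu;

	return (int)best;
}
```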

Thanks

Regards
Preeti U Murthy
