Re: [RFC][PATCH 0/9] sched: Power scheduler design proposal

From: Preeti U Murthy
Date: Thu Jul 11 2013 - 07:37:28 EST


Hi Morten,

I have a few quick comments.

On 07/09/2013 10:28 PM, Arjan van de Ven wrote:
> On 7/9/2013 8:55 AM, Morten Rasmussen wrote:
>> Hi,
>>
>> This patch set is an initial prototype aiming at the overall power-aware
>> scheduler design proposal that I previously described
>> <http://permalink.gmane.org/gmane.linux.kernel/1508480>.
>>
>> The patch set introduces a cpu capacity managing 'power scheduler' which
>> lives by the side of the existing (process) scheduler. Its role is to
>> monitor the system load and decide which cpus that should be available to
>> the process scheduler. Long term the power scheduler is intended to
>> replace the currently distributed uncoordinated power management policies
>> and will interface a unified platform specific power driver obtain power
>> topology information and handle idle and P-states. The power driver
>> interface should be made flexible enough to support multiple platforms
>> including Intel and ARM.
>>
> I quickly browsed through it but have a hard time seeing what the
> real interface is between the scheduler and the hardware driver.
> What information does the scheduler give the hardware driver exactly?
> e.g. what does it mean?
>
> If the interface is "go faster please" or "we need you to be at fastest
> now", that doesn't sound too bad.
> But if the interface is "you should be at THIS number" that is pretty bad
> and not going to work for us.
>
> also, it almost looks like there is a fundamental assumption in the code
> that you can get the current effective P state to make scheduler decisions
> on; on Intel at least that is basically impossible... and getting more so
> with every generation (likewise for AMD afaics)

I am also concerned about the scheduler making its load balancing
decisions based on the cpu frequency, because this could create an
imbalance in the load across cpus.

The scheduler could keep loading a cpu because its frequency keeps
increasing, and keep unloading another cpu because its frequency keeps
decreasing, when the increase and decrease are themselves an effect of
the load. This is of course assuming that the driver makes its decisions
in proportion to the cpu load. There could be many more complications if
the driver bases its decisions on factors unknown to the scheduler.
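
A toy userspace model may make the loop easier to see. This is only a
sketch, not kernel code: every name in it (toy_cpu, toy_driver_update,
toy_balance, toy_simulate) is made up for illustration, and it assumes
both a driver that sets frequency in proportion to load and a balancer
that moves one unit of load per round towards the faster cpu.

```c
/* Hypothetical toy model of the feedback loop; not kernel code. */
struct toy_cpu {
	int load;	/* units of runnable work */
	int freq;	/* arbitrary frequency units */
};

/* Assumed driver policy: frequency tracks load proportionally. */
static void toy_driver_update(struct toy_cpu *c)
{
	c->freq = 100 + 10 * c->load;
}

/* Assumed balancer policy: move one unit of load to the faster cpu. */
static void toy_balance(struct toy_cpu *cpus)
{
	int fast, slow;

	if (cpus[0].freq == cpus[1].freq)
		return;		/* no cpu looks faster, leave load alone */

	fast = cpus[0].freq > cpus[1].freq ? 0 : 1;
	slow = 1 - fast;

	if (cpus[slow].load > 0) {
		cpus[slow].load--;
		cpus[fast].load++;
	}
}

/*
 * Alternate driver updates and balancing for 'rounds' iterations on a
 * two-cpu system. Returns load(cpu0) - load(cpu1) at the end.
 */
int toy_simulate(int load0, int load1, int rounds)
{
	struct toy_cpu cpus[2] = { { load0, 0 }, { load1, 0 } };
	int r, i;

	for (r = 0; r < rounds; r++) {
		for (i = 0; i < 2; i++)
			toy_driver_update(&cpus[i]);
		toy_balance(cpus);
	}
	return cpus[0].load - cpus[1].load;
}
```

Starting from loads of 6 and 4, toy_simulate(6, 4, 10) ends with all ten
units of load on cpu 0, although the initial imbalance was small. Under
these assumptions, only a perfectly even start stays balanced.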

Therefore my suggestion is that we should simply have the scheduler ask
for an increase or decrease in the frequency and leave it at that.
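
As a rough sketch of what such a hint-only interface could look like: the
names below (toy_freq_hint, toy_power_driver, toy_sched_pick_hint,
toy_driver_apply_hint) are hypothetical and not an existing kernel
interface, and the 80%/20% utilization thresholds are arbitrary
assumptions. The point is only that the scheduler passes a direction and
never reads or sets an absolute P-state.

```c
/* Hypothetical hint-only interface; not an existing kernel API. */
enum toy_freq_hint {
	TOY_FREQ_HINT_NONE,
	TOY_FREQ_HINT_INCREASE,
	TOY_FREQ_HINT_DECREASE,
};

/* Driver-private state; the scheduler never looks inside this. */
struct toy_power_driver {
	int pstate;
	int min_pstate;
	int max_pstate;
};

/* Scheduler side: turn utilization (percent) into a direction only. */
enum toy_freq_hint toy_sched_pick_hint(int util_pct)
{
	if (util_pct > 80)	/* assumed threshold */
		return TOY_FREQ_HINT_INCREASE;
	if (util_pct < 20)	/* assumed threshold */
		return TOY_FREQ_HINT_DECREASE;
	return TOY_FREQ_HINT_NONE;
}

/*
 * Driver side: free to interpret the hint as it sees fit. This toy
 * version steps by one state and clamps to the supported range.
 */
void toy_driver_apply_hint(struct toy_power_driver *drv,
			   enum toy_freq_hint hint)
{
	if (hint == TOY_FREQ_HINT_INCREASE && drv->pstate < drv->max_pstate)
		drv->pstate++;
	else if (hint == TOY_FREQ_HINT_DECREASE &&
		 drv->pstate > drv->min_pstate)
		drv->pstate--;
}
```

A real driver could ignore a hint or act on it differently, which is the
reason for keeping the P-state private to the driver in this sketch.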

Secondly, I think we should spend more time in your patchset on deciding
when the scheduler should call the frequency driver to request a change
in cpu frequency. The whole point of integrating knowledge of cpu
frequency statistics into the scheduler is that the scheduler can call
the frequency driver at times that *complement* load balancing, unlike
now.

Also adding Rafael to the cc list.

Regards
Preeti U Murthy
