Re: Plumbers: Tweaking scheduler policy micro-conf RFP
From: Pantelis Antoniou
Date: Tue May 15 2012 - 07:36:01 EST
On May 15, 2012, at 1:28 PM, Peter Zijlstra wrote:
> On Tue, 2012-05-15 at 12:17 +0300, Pantelis Antoniou wrote:
>> IMO this whole idea about 'server' or 'desktop' schedulers is bunk.
>
> Yeah, its complete shite. Everybody cares about throughput, latency and
> power. The exact balance might differ between workloads but those cannot
> be split between desktop/server at all. Furthermore, nobody wants one at
> all costs to the others.
Expanding on this a little more, the balance between the factors might
vary with the workload you are running at the time, rather than following
a pre-set scenario.
Take a server configuration, for example: one would expect it to be geared
towards throughput with no regard for power or latency. This is not the
case today. Power savings can be considerable, and low latency might be
very desirable if you run something like a VoIP-based soft PBX on it.
The same goes for a desktop: it runs ooffice and a browser most of the time,
but you would also expect it to run a game or an audio editing/performing
application.
The smart-phone case is like juggling coals; you need the minimum
power draw, but you had better offer minimum latency and high
throughput when pissed-off-avians is on.
Now the question is how to fit this in a scheduler policy.
We have the three that Peter mentioned:
Throughput
Latency
Power
I can think of two more: thermal management and memory I/O.
What others can we come up with? And what units are we going
to measure them in?
For example:
Throughput: MIPS(?), bogo-mips(?), some kind of performance counter?
Latency: usecs(?)
Power: Now that's a tricky one. We can't measure power directly; it's a
function of the CPU load we run over a period of time, along with the
history of the C-states and P-states during that period. How can we collect
information about that? We also need to take peripheral device power
into account; GPUs are particularly power hungry.
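
One way to approximate the CPU part is to weight the time spent in each
P-state by an assumed power figure for that state. The sketch below is
only an illustration: it takes text in the "<freq_khz> <time>" layout that
cpufreq-stats exposes as time_in_state (time in 10 ms units), and the
watts_at_freq table is a hypothetical, board-specific input that would
have to come from measurement. It ignores C-states and peripherals.

```python
def parse_time_in_state(text):
    """Parse '<freq_khz> <ticks>' lines; ticks are assumed to be 10 ms units."""
    residency = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        freq_khz, ticks = int(fields[0]), int(fields[1])
        residency[freq_khz] = ticks * 0.01  # 10 ms units -> seconds
    return residency


def estimate_energy_joules(before, after, watts_at_freq):
    """Sum (seconds spent at a frequency) * (assumed watts at that frequency).

    'before' and 'after' are two residency snapshots; watts_at_freq is a
    hypothetical per-frequency power table, not something the kernel provides.
    """
    total = 0.0
    for freq_khz, seconds_after in after.items():
        delta = seconds_after - before.get(freq_khz, 0.0)
        total += delta * watts_at_freq[freq_khz]
    return total
```

Taking two snapshots and feeding the difference through the table gives a
rough energy figure for the interval; dividing by the interval length
gives average power.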
Thermal management: How to distribute load across the processors in such
a way that the die temperature doesn't rise so much that we have to
either drop to a lower OPP or shut down the core altogether.
This is in direct conflict with throughput, since we'd get better
performance if we could keep the same warmed-up CPU going.
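
The conflict can be made concrete with a toy placement rule: stay on the
previous CPU (warm caches) while it has thermal headroom, and only migrate
to the coolest CPU once it gets close to the trip point. This is a sketch
under assumptions; pick_cpu, trip_c and margin_c are hypothetical names
and values, not an existing scheduler interface.

```python
def pick_cpu(prev_cpu, temps_c, trip_c, margin_c=5.0):
    """Choose a CPU for a task given per-CPU temperatures in degrees C.

    temps_c maps cpu id -> temperature. trip_c is the temperature at which
    we would be forced to a lower OPP; margin_c is the headroom we want
    to keep. Both are illustrative inputs.
    """
    # Prefer cache warmth as long as the previous CPU is comfortably cool.
    if temps_c[prev_cpu] < trip_c - margin_c:
        return prev_cpu
    # Otherwise trade cache warmth for temperature: go to the coolest CPU.
    return min(temps_c, key=temps_c.get)
```

A larger margin_c favours thermal safety; a smaller one favours
throughput by keeping tasks where their caches are warm.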
Memory I/O: Some workloads are memory bandwidth hungry but do not need
much CPU power. In the case of asymmetric cores it would make sense to move
the memory bandwidth hog to a lower performance CPU without any
performance impact. We would probably need some kind of performance
counter for that, which is not going to be very generic.
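
As a rough illustration of the performance-counter idea, a task could be
flagged as memory bound when it misses the last-level cache often and
retires few instructions per cycle. Everything here is an assumption for
the sake of the sketch: the counter names, the thresholds, and the
"big"/"little" labels would all be platform specific.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """Hypothetical per-task counter deltas over one sampling window."""
    cycles: int
    instructions: int
    llc_misses: int


def is_memory_bound(s, mpki_threshold=10.0, ipc_threshold=0.5):
    """Heuristic: many LLC misses per kilo-instruction and low IPC.

    The thresholds are made-up defaults and would need tuning per platform.
    """
    if s.instructions == 0 or s.cycles == 0:
        return False  # nothing ran, no basis for a decision
    mpki = 1000.0 * s.llc_misses / s.instructions
    ipc = s.instructions / s.cycles
    return mpki >= mpki_threshold and ipc <= ipc_threshold


def preferred_core(s):
    """Send memory-bound tasks to the lower performance core type."""
    return "little" if is_memory_bound(s) else "big"
```

For instance, a sample with 1,000,000 cycles, 300,000 instructions and
9,000 LLC misses has an IPC of 0.3 and 30 misses per kilo-instruction, so
this heuristic would place it on a "little" core.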
Any more ideas?
Regards
-- Pantelis