Re: Plumbers: Tweaking scheduler policy micro-conf RFP
From: Luming Yu
Date: Sat May 19 2012 - 11:05:30 EST
On Tue, May 15, 2012 at 7:58 PM, Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> wrote:
> On Tue, 2012-05-15 at 14:35 +0300, Pantelis Antoniou wrote:
>> Power: Now that's a tricky one, we can't measure power directly, it's a
>> function of the cpu load we run in a period of time, along with any
>> history of the cstates & pstates of that period. How can we collect
>> information about that? Also we need to take peripheral device power
>> into account on top of that; GPUs are particularly power hungry.
>
> Intel provides some measure of CPU power drain on recent chips (iirc),
> but yeah that doesn't include GPUs and other peripherals iirc.
>
>> Thermal management: How to distribute load to the processors in such
>> a way that the temperature of the die doesn't increase so much that
>> we have to either go to a lower OPP or shut down the core altogether.
>> This is in direct conflict with throughput since we'd have better performance
>> if we could keep the same warmed-up cpu going.
>
> Core-hopping.. yay! We have the whole sensors framework that provides an
> interface to such hardware, the question is, do chips have enough
> sensors spread on them to be useful?
>
>> Memory I/O: Some workloads are memory bandwidth hungry but do not need
>> much CPU power. In the case of asymmetric cores it would make sense to move
>> the memory bandwidth hog to a lower performance CPU without any impact.
>> Probably need to use some kind of performance counter for that; not going
>> to be very generic.
>
> You're assuming the slower cores have the same memory bandwidth, isn't
> that a dangerous assumption?
>
> Anyway, so the 'problem' with using PMCs from within the scheduler is
> that, 1) they're ass backwards slow on some chips (x86 anyone?) 2) some
> userspace gets 'upset' if they can't get at all of them.
>
> So it has to be optional at best, and I hate knobs :-) Also, the more
> information you're going to feed this load-balancer thing, the harder
> all that becomes, you don't want to do the full nm! m-dimensional bin
> fit.. :-)
>
Just curious: if load balancing doesn't necessarily mean power/thermal
balancing or memory balancing, where should one hack to satisfy those
needs when they become critical? For example, people may want a
fine-grained usage plan to control how processors get used day to day.
They may want to idle half of the processors for a few hours, with the
request carried out so that it has minimal impact on the quality of
service the system provides. That means:

1. we need to choose the best CPUs to idle;
2. soft-offlining CPUs is not the right solution;
3. when really needed, we must get them back into service as fast as
   possible.

So my question is: can any existing scheduler help me do this? Yes, I
need knobs or interfaces :-) that another driver, possibly thermal or
power related, could use. If there isn't one, I would be very interested
in helping to create one.
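
To make point 1 above a bit more concrete, here is a rough sketch of the
kind of selection policy I have in mind, written in Python only for
brevity. Everything in it is an assumption for illustration:
pick_cpus_to_idle, load_by_cpu and keep are hypothetical names, not an
existing kernel interface, and "least loaded first" is just one possible
notion of "best".

```python
def pick_cpus_to_idle(load_by_cpu, n_idle, keep=(0,)):
    """Return the ids of the n_idle least-loaded CPUs, in ascending order.

    load_by_cpu: dict mapping CPU id -> recent load in [0.0, 1.0]
                 (hypothetical input; how to measure it is an open question)
    n_idle:      how many CPUs the caller wants idled
    keep:        CPU ids that must never be idled (assumed: CPU 0)
    """
    candidates = [cpu for cpu in load_by_cpu if cpu not in keep]
    if n_idle > len(candidates):
        raise ValueError("not enough CPUs available to idle")
    # Least-loaded first, so the least work has to migrate;
    # on equal load, prefer the higher-numbered CPU.
    candidates.sort(key=lambda cpu: (load_by_cpu[cpu], -cpu))
    return sorted(candidates[:n_idle])


if __name__ == "__main__":
    loads = {0: 0.10, 1: 0.90, 2: 0.20, 3: 0.05}
    print(pick_cpus_to_idle(loads, 2))
```

A real policy would presumably also look at topology and thermal data
rather than load alone; the sketch only shows where such a decision
would sit.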
Thanks.
/l