Re: [RFC][PATCH 0/9] sched: Power scheduler design proposal

From: Arjan van de Ven
Date: Mon Jul 15 2013 - 11:25:39 EST


On 7/15/2013 2:55 AM, Catalin Marinas wrote:
> In terms of how it boosts the performance, a suggestion was to keep the
> power scheduler relatively simple with an API to a new model of power
> driver and have the actual scaling algorithm (governor) as library used
> by the low-level driver. We can keep the API simple like
> get_max_performance() etc. but the driver has the potential to choose
> what is best suited for the hardware.

I like simple ;-)
For high-level concepts I also like descriptive and intent-driven (rather than prescriptive),
and I like libraries of functionality you can pull from.

One thing we're skirting around in this whole discussion is the concept of performance sensitivity,
or, to phrase it as a question: "Is more performance desired right now?"
Some of the answers can certainly come from the scheduler; in certain specific cases
it will know that the answer to that question is "yes". An oversubscribed runqueue
is certainly such a case. So is scheduling a realtime/high-priority/whatever task; the scheduler
knows more than anyone else about that.
There are other cases elsewhere in the kernel (for example, the graphics driver may have ideas
if it just missed a frame).
Very high interrupt rates are another clear case of such sensitivity.

(And I'm quite fine presuming a "no, unless ..." policy for that question.)
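
To make the "no unless" idea concrete, here is a rough sketch in Python (for brevity only, this is not kernel code). Every name in it, including the Hints fields and the interrupt-rate threshold, is hypothetical and made up for illustration; it only shows the shape of "default to no, flip to yes when any known signal fires".

```python
from dataclasses import dataclass


@dataclass
class Hints:
    """Hypothetical snapshot of the signals discussed above (not real kernel fields)."""
    nr_running: int = 0           # runnable tasks on the runqueue
    nr_cpus: int = 1              # cpus serving that runqueue
    rt_task_queued: bool = False  # a realtime/high-priority task is being scheduled
    missed_frame: bool = False    # e.g. reported by a graphics driver
    irq_rate_hz: float = 0.0      # recent interrupt rate


# Assumed threshold, chosen arbitrarily for the sketch.
IRQ_RATE_THRESHOLD_HZ = 50_000.0


def more_performance_desired(h: Hints) -> bool:
    """Answer "is more performance desired right now?" with a "no unless" policy."""
    if h.nr_running > h.nr_cpus:      # oversubscribed runqueue
        return True
    if h.rt_task_queued:              # scheduler knows this task matters
        return True
    if h.missed_frame:                # hint from elsewhere in the kernel
        return True
    if h.irq_rate_hz > IRQ_RATE_THRESHOLD_HZ:
        return True
    return False                      # the default answer is "no"
```

The point of the shape is that each subsystem only has to contribute a hint; none of them has to prescribe a frequency.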

What is hard for the scheduler is that by the time it realizes it's in a hole,
it may already be too late. Yes, P states change relatively quickly... and it is certainly
worth saying "I'm in the hole, go faster!".
But seeing the impact of the "go faster" on the RQ takes time; only some time later
(say 10 to 100 msec) is the scheduler able to evaluate whether the change helped enough.
It's tempting to just wait... but maybe the right answer is to do two things: load balance right now,
AND boost the P state of the cpus that run the load after the balance. And then, 10 to 100 msec later,
evaluate whether they can be balanced/consolidated back.
I.e. jump out of the hole instantly, and then look later whether the hole is filled enough to jump back in ;-)
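
A minimal sketch of that "act now, re-evaluate later" sequence, again in Python purely for illustration. The class, the action names and the 50 msec delay are all assumptions of the sketch (the delay is just one value inside the 10 to 100 msec range mentioned above); time is passed in explicitly so the logic can be followed without a real clock.

```python
REEVALUATE_AFTER_MS = 50  # assumed; anywhere in the 10..100 msec range would do


class BoostController:
    """Hypothetical controller: balance and boost at once, reconsider later."""

    def __init__(self):
        self.boosted = False
        self.recheck_at_ms = None
        self.actions = []  # log of what we asked for, in order

    def on_overload(self, now_ms):
        # "I'm in the hole, go faster!": do both things right away.
        self.actions.append("load_balance")
        self.actions.append("boost_pstate")
        self.boosted = True
        self.recheck_at_ms = now_ms + REEVALUATE_AFTER_MS

    def tick(self, now_ms, still_overloaded):
        # Too early to judge the effect of the boost on the runqueue.
        if not self.boosted or now_ms < self.recheck_at_ms:
            return
        if still_overloaded:
            # Hole not filled yet: stay boosted and look again later.
            self.recheck_at_ms = now_ms + REEVALUATE_AFTER_MS
        else:
            # Hole is filled enough: consolidate back and drop the boost.
            self.actions.append("consolidate")
            self.actions.append("unboost_pstate")
            self.boosted = False
```

For example, calling on_overload(0) and then tick(10, False) changes nothing, because the re-evaluation window has not elapsed yet; only a tick at or after 50 msec with the overload gone undoes the boost.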
