Re: [RFC PATCH] sched: find the latest idle cpu
From: Daniel Lezcano
Date: Thu Jan 16 2014 - 06:03:24 EST
On 01/15/2014 03:37 PM, Alex Shi wrote:
> On 01/15/2014 03:35 PM, Peter Zijlstra wrote:
>> On Wed, Jan 15, 2014 at 12:07:59PM +0800, Alex Shi wrote:
>>> Currently we just try to find least load cpu. If some cpus idled,
>>> we just pick the first cpu in cpu mask.
>>> In fact we can get the interrupted idle cpu or the latest idled cpu,
>>> then we may get the benefit from both latency and power.
>>> The selected cpu maybe not the best, since other cpu may be interrupted
>>> during our selecting. But be captious costs too much.
>> No, we should not do anything like this without first integrating
>> cpuidle.
>> At which point we have a sane view of the idle states and can make a
>> sane choice between them.
> Any comments to make it better?
It is a nice optimization attempt, but I agree with Peter that we should
focus on integrating cpuidle.
The question is "how do we integrate cpuidle?"
IMHO, the main problem is the governors, especially the menu governor.
The menu governor tries to predict the events per cpu. This approach
which gave us a nice benefit for the power saving may not fit well for
I think we can classify the events in three categories:
1. fully predictable (timers)
2. partially predictable (e.g. MMC, SSD or network)
3. unpredictable (e.g. keyboard, network ingress after quiescent period)
The menu governor mixes categories 2 and 3, using statistics and a
performance multiplier to reach shallow states, based on heuristics and
experimentation for a specific platform.
I was wondering if we shouldn't create per-task IO latency tracking.
Mostly based on io_schedule and io_schedule_timeout, we track the
latency of each task for each device, keeping an rb-tree up to date
where the left-most leaf is the minimum latency for all the tasks
running on a specific cpu. That allows better tracking when moving tasks
With this approach, we have something consistent with the per load task
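To make the idea concrete, here is a minimal userspace sketch of the
per-cpu bookkeeping. It is only an illustration under assumptions: all
names (io_latency_node, cpu_io_latency, io_latency_*) are hypothetical,
a sorted singly linked list stands in for the rb-tree, and one node
represents the tracked latency of one (task, device) pair. The head of
the list plays the role of the left-most leaf.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: tracked IO latency for one (task, device) pair. */
struct io_latency_node {
	uint64_t latency_us;
	struct io_latency_node *next;
};

/* Hypothetical per-cpu container. A sorted list stands in for the
 * rb-tree; 'first' corresponds to the left-most leaf. */
struct cpu_io_latency {
	struct io_latency_node *first;
};

/* Insert a node, keeping the list sorted by ascending latency. */
static void io_latency_insert(struct cpu_io_latency *cpu,
			      struct io_latency_node *n)
{
	struct io_latency_node **pos = &cpu->first;

	while (*pos && (*pos)->latency_us <= n->latency_us)
		pos = &(*pos)->next;
	n->next = *pos;
	*pos = n;
}

/* Unlink a node if it is present on this cpu. */
static void io_latency_remove(struct cpu_io_latency *cpu,
			      struct io_latency_node *n)
{
	struct io_latency_node **pos = &cpu->first;

	while (*pos && *pos != n)
		pos = &(*pos)->next;
	if (*pos) {
		*pos = n->next;
		n->next = NULL;
	}
}

/* Store the minimum latency in *out; return 0 if nothing is tracked. */
static int io_latency_min(const struct cpu_io_latency *cpu, uint64_t *out)
{
	if (!cpu->first)
		return 0;
	*out = cpu->first->latency_us;
	return 1;
}

/* When a task migrates, its latency information moves with it. */
static void io_latency_migrate(struct cpu_io_latency *src,
			       struct cpu_io_latency *dst,
			       struct io_latency_node *n)
{
	io_latency_remove(src, n);
	io_latency_insert(dst, n);
}
```

With nodes of 500, 120 and 900 us on one cpu, io_latency_min() reports
120; after migrating the 120 us node away, the source cpu reports 500
and the destination reports 120. A real implementation would use the
kernel rb-tree so that insertion and removal stay logarithmic.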
This IO latency tracking gives us the next wake-up event, which we can
inject into the cpuidle framework directly. That removes all the code
related to the menu governor statistics based on IO events and
simplifies the menu governor code a lot. So we replace a piece of the
cpuidle code with scheduler code, which I hope could be better for
prediction, and which is one part of the integration.
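As a rough sketch of how such a predicted wake-up could drive the idle
state choice: take the earlier of the next timer (category 1) and the
expected IO completion (category 2), then pick the deepest state whose
target residency fits in that window. The struct and function names
below are hypothetical and this is not the actual cpuidle API; real
governors also account for exit latency and latency constraints, which
are left out here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified idle state description. */
struct idle_state {
	const char *name;
	uint64_t target_residency_us; /* minimum idle time to break even */
};

/*
 * Return the index of the deepest state whose target residency does
 * not exceed the predicted idle duration. States are assumed to be
 * ordered from shallowest (index 0) to deepest. Index 0 is the
 * fallback when nothing else fits.
 */
static size_t pick_idle_state(const struct idle_state *states, size_t n,
			      uint64_t next_timer_us,
			      uint64_t next_io_us)
{
	/* The cpu wakes at whichever event comes first. */
	uint64_t predicted_us = next_timer_us < next_io_us ?
				next_timer_us : next_io_us;
	size_t best = 0;
	size_t i;

	for (i = 1; i < n; i++) {
		if (states[i].target_residency_us <= predicted_us)
			best = i;
	}
	return best;
}
```

For example, with states of 0, 100 and 1000 us target residency, a
timer 5000 us away and an IO expected in 300 us, the middle state is
selected, since the IO completion bounds the idle period.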
In order to finish integrating the cpuidle framework into the scheduler,
there are pending questions about the impact on the current design.
Peter or Ingo, if you have time, could you have a look at the email I
sent previously?