Re: [RFC PATCH V2 01/19] sched/power: Remove cpu idle state selection and cpu frequency tuning

From: Nicolas Pitre
Date: Mon Aug 18 2014 - 13:53:53 EST


On Mon, 18 Aug 2014, Preeti U Murthy wrote:

> On 08/18/2014 09:09 PM, Nicolas Pitre wrote:
> > On Mon, 11 Aug 2014, Preeti U Murthy wrote:
> >
> >> As a first step towards improving the power awareness of the scheduler,
> >> this patch enables a "dumb" state where all power management is turned off.
> >> Whatever else we put into the kernel for cpu power management must
> >> do better than this in terms of performance as well as power savings.
> >> This will enable us to benchmark and optimize the power aware scheduler
> >> from scratch. If we are to benchmark it against the performance of the
> >> existing design, we will get sufficiently distracted by the performance
> >> numbers and get steered away from a sane design.
> >
> > I understand your goal here, but people *will* compare performance
> > between the old and the new design anyway. So I think it would be a
> > better approach to simply let the existing code be and create a new
> > scheduler-based governor that can be swapped with the existing ones at
> > run time. Eventually we'll want average users to test and compare this,
> > and asking them to recompile a second kernel and reboot between them
> > might get unwieldy for many people.
> >
> > And by allowing both to coexist at run time, we're making sure both the
> > old and the new code are built, which helps avoid breaking the old code.
> > That will also cut down on the number of #ifdefs in many places.
> >
> > In other words, CONFIG_SCHED_POWER is needed to select the
> > scheduler-based governor, but it shouldn't force the existing code to
> > be disabled.
>
> I don't think I understand you here. So are you proposing a runtime
> switch like a sysfs interface instead of a config switch?

Absolutely.

And looking at drivers/cpuidle/sysfs.c:store_current_governor() it seems
that the facility is there already.
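
A minimal user-space sketch of driving that facility, in Python for
brevity. The sysfs directory and the file names current_governor and
available_governors are my assumptions about the interface, not something
stated above; if I read the code right, kernels of this vintage only
expose the writable files when booted with cpuidle_sysfs_switch. Writing
to the real files needs root.

```python
from pathlib import Path

# Assumed location of the cpuidle governor files in sysfs.
CPUIDLE_DIR = Path("/sys/devices/system/cpu/cpuidle")


def read_governor(base=CPUIDLE_DIR):
    """Return the name of the currently active cpuidle governor."""
    return (Path(base) / "current_governor").read_text().strip()


def available_governors(base=CPUIDLE_DIR):
    """Return the list of governor names the kernel offers."""
    return (Path(base) / "available_governors").read_text().split()


def switch_governor(name, base=CPUIDLE_DIR):
    """Select a governor by name and return the one that was active before."""
    base = Path(base)
    if name not in available_governors(base):
        raise ValueError("governor not available: " + name)
    previous = read_governor(base)
    # The kernel side parses the written name and switches governors.
    (base / "current_governor").write_text(name + "\n")
    return previous
```

Something like switch_governor("menu") before one benchmark run and
switch_governor("ladder") before the next would then compare the two
without a rebuild or reboot. The base argument is only there so the
helpers can be pointed at a scratch directory for testing.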

> Wouldn't that be unwise given that it's a complete turnaround of the
> kernel's behavior after the switch?

Oh sure. This is like changing cpufreq governors at run time. But
people should know what they're playing with and that system behavior
changes are expected.

> I agree that the first patch is a dummy patch; it's
> meant to ensure that we have *at least* the power efficiency that this
> patch brings in. Of course after that point this patch is a no-op. In
> fact the subsequent patches will mitigate the effect of this.

Still, allowing runtime switches between the legacy governors and
the in-scheduler governor will greatly facilitate benchmarking.

And since our goal is to surpass the legacy governors, we should set
them as our reference mark from the start.


Nicolas