Re: [RFCv3 PATCH 33/48] sched: Energy-aware wake-up task placement

From: Peter Zijlstra
Date: Tue Apr 28 2015 - 09:06:51 EST

On Mon, Apr 27, 2015 at 09:01:13AM -0700, Michael Turquette wrote:
> Quoting Peter Zijlstra (2015-03-26 03:41:50)
> > On Thu, Mar 26, 2015 at 10:21:24AM +0000, Juri Lelli wrote:
> > > - what about other sched classes? I know that this is very premature,
> > > but I can't help but think that we'll need to do some sort of
> > > aggregation of requests, and if we put triggers in very specialized
> > > points we might lose some of the sched classes separation
> >
> > So for deadline we can do P state selection (as you're well aware) based
> > on the requested utilization. Not sure what to do for fifo/rr though,
> > they lack much useful information (as always).
> >
> > Now if we also look ahead to things like the ACPI CPPC stuff we'll see
> > that CFS and DL place different requirements on the hints. Where CFS
> > would like to hint a max perf (the hardware going slower due to the code
> > consisting of mostly stalls is always fine from a best effort energy
> > pov), the DL stuff would like to hint a min perf, seeing how it 'needs'
> > to provide a QoS.
> >
> > So we either need to carry this information along in a 'generic' way
> > between the various classes or put the hinting in every class.
> >
> > But yes, food for thought for sure.
>
> I am a fan of putting the hints in every class. One idea I've been
> considering is that each sched class could have a small, simple cpufreq
> governor that expresses its constraints (max for cfs, min qos for dl)
> and then the cpufreq core Does The Right Thing.
>
> This would be a multi-governor approach, which requires some surgery to
> cpufreq core code, but I like the modularity and maintainability of it
> more than having one big super governor that has to satisfy every need.

Well, at that point we really don't need cpufreq anymore do we? All
you need is the hardware driver (ACPI P-state, ACPI CPPC etc.).

Because as I understand it, cpufreq currently is mostly the governor
thing (which we'll replace) and some infra for dealing with these head
cases that require scheduling for changing P states (which we can leave
on cpufreq proper for the time being).

Would it not be easier to just start from scratch and convert the (few)
drivers we need to prototype this? Instead of trying to drag the
entirety of cpufreq along just to keep all the drivers?
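
For illustration only, the per-class hinting discussed in this thread (a max
perf hint from CFS, a min perf hint from DL) could be combined along the
following lines. This is a minimal userspace sketch, not kernel code and not
anything proposed in the patch set: struct perf_hint, aggregate_hints(),
PERF_SCALE and the "0 means no opinion" convention are all hypothetical names
and assumptions made up for the example, and the "floor wins over ceiling"
policy is just one possible choice.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scale: 1024 stands for the highest performance level. */
#define PERF_SCALE 1024u

/*
 * Hypothetical hint a scheduling class might publish.
 * A value of 0 means the class has no opinion on that bound.
 */
struct perf_hint {
	unsigned int min_perf;	/* floor needed for QoS, e.g. from DL */
	unsigned int max_perf;	/* ceiling, e.g. from CFS (best effort) */
};

/*
 * Combine the hints of all classes into one request for the hardware
 * driver. Assumed policy: take the largest floor and the largest
 * ceiling, clamp both to PERF_SCALE, and let the floor win if the two
 * conflict, since the floor expresses a QoS requirement.
 */
static struct perf_hint aggregate_hints(const struct perf_hint *hints, size_t n)
{
	struct perf_hint out = { 0, 0 };
	size_t i;

	assert(hints != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (hints[i].min_perf > out.min_perf)
			out.min_perf = hints[i].min_perf;
		if (hints[i].max_perf > out.max_perf)
			out.max_perf = hints[i].max_perf;
	}

	if (out.min_perf > PERF_SCALE)
		out.min_perf = PERF_SCALE;
	if (out.max_perf > PERF_SCALE)
		out.max_perf = PERF_SCALE;

	/* A QoS floor above the best-effort ceiling raises the ceiling. */
	if (out.max_perf < out.min_perf)
		out.max_perf = out.min_perf;

	return out;
}
```

With a DL hint of { 300, 0 } and a CFS hint of { 0, 600 } this yields a
request of min 300, max 600; if CFS only asked for a max of 200, the DL floor
would win and both bounds would come out as 300.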