Re: [PATCH 2/2] sched: cpufreq: use rt_avg as estimate of required RT CPU capacity
From: Thomas Gleixner
Date: Fri Sep 02 2016 - 04:15:07 EST
On Wed, 31 Aug 2016, Peter Zijlstra wrote:
> On Wed, Aug 31, 2016 at 06:28:10PM +0200, Thomas Gleixner wrote:
> > > That is the way it's been with cpufreq and many systems (including all
> > > mobile devices) rely on that to not destroy power. RT + variable cpufreq
> > > is not deterministic.
> > >
> > > Given we don't have good constraints on RT tasks I don't think we should
> > > try to strengthen the semantics there. Folks should either move to DL if
> > > they want determinism *and* not-sucky power, or continue disabling
> > > cpufreq if they are able to do so.
> >
> > RT deterministic behaviour is all about meeting the deadlines. If your
> > deadline is relaxed enough that you can meet it even with the lowest cpu
> > frequency then it's perfectly fine to enable cpufreq. The same logic applies
> > to C-States.
> >
> > There are a lot of RT systems out there which enable both. If cpufreq or
> > c-states cause a deadline violation because the constraints of the system are
> > tight, then people will disable it and we need a knob for both.
> >
> > Realtime is not as fast as possible. It's as fast as specified.
>
> Sure, problem is of course that RR/FIFO doesn't specify anything so the
> users are left to prod knobs.

I know :(

> Another problem is that we have many semi related knobs; we have the
> global RT runtime limit knob, but that doesn't affect cpufreq (maybe it
> should) and cpufreq has knobs to set f_min and f_max, which again are
> unaware of RT anything.
>
> So before we go do anything, I'd like input on what is needed and how
> things should tie together to make most sense.

RT systems, and especially RR/FIFO driven ones, need a lot of specific tuning
and configuration. I doubt that we can do anything except lousy heuristics
which will end up being wrong for most use cases.

In the DL case we certainly can make informed decisions, but for the RR/FIFO
case the global RT runtime limit is simply too big a hammer, which shouldn't be
abused for calculating cpufreq limits.
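
To illustrate why DL allows an informed decision: a DL task states its runtime
and deadline explicitly, so a frequency floor can be derived from them. The
following is only a rough sketch, not kernel code. The helper name
dl_min_freq_khz() is hypothetical, and it assumes that the runtime was measured
at f_max and that execution time scales linearly with frequency, which real
hardware only approximates.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not an existing kernel function.
 *
 * Returns the lowest frequency (in kHz) at which a DL reservation of
 * runtime_ns per deadline_ns still fits, assuming:
 *   - runtime_ns was measured at f_max_khz
 *   - execution time scales linearly with frequency
 *
 * The multiplication can overflow for very large inputs; this sketch
 * ignores that for the sake of brevity.
 */
static uint64_t dl_min_freq_khz(uint64_t runtime_ns, uint64_t deadline_ns,
				uint64_t f_max_khz)
{
	/* No slack at all (or nonsensical input): stay at f_max */
	if (deadline_ns == 0 || runtime_ns >= deadline_ns)
		return f_max_khz;

	/* Round up so the result is never too low to meet the deadline */
	return (runtime_ns * f_max_khz + deadline_ns - 1) / deadline_ns;
}
```

For example, 2ms of runtime against a 10ms deadline on a 2GHz part, i.e.
dl_min_freq_khz(2000000, 10000000, 2000000), yields 400000, so 400MHz would
suffice under these assumptions. RR/FIFO tasks provide no equivalent of
runtime_ns or deadline_ns, so no such calculation is possible for them.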

I think that we should concentrate on DL and make it work very well and just
leave the rest of the RT folks with rather simplistic knobs (i.e. on/off/hard
limits). That will force people who have RT _and_ power constraints to think
harder about their system design and eventually make them move over to DL.

Thanks,
tglx