Re: [RFC] Documentation/scheduler/schedutil.txt
From: Mel Gorman
Date: Wed Dec 02 2020 - 11:46:32 EST
On Wed, Dec 02, 2020 at 04:54:52PM +0100, Peter Zijlstra wrote:
> > IIRC, this 32ms is tied to the value of LOAD_AVG_PERIOD and the length
> > of the ewma_sum series below. Might be worth expanding a little further.
>
> It is LOAD_AVG_PERIOD. Some people (re)generate the PELT tables with a
> different period (16 and 64 are common).
>
> Not sure what there is to expand; the whole of it is: y^32=0.5. We had
> to pick some half-life period, 32 seemed like a good number.
>
No issue with the number itself, only that y^32 is tied to LOAD_AVG_PERIOD.
Again, it's something that someone looking at the source would eventually
figure out, so leaving it as-is is probably for the best.
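
For anyone reading along who wants to see the relationship spelled out, here
is a rough sketch (not kernel code) of how the decay factor y follows from
the half-life, for the default period of 32 and the 16 and 64 variants
mentioned above. The function names are made up for illustration, and the
series limit assumes the plain geometric form 1 + y + y^2 + ... for the
ewma_sum series.

```python
# Illustrative only: derive the PELT decay factor y from the half-life
# (LOAD_AVG_PERIOD in the kernel source). Function names are hypothetical.

def decay_factor(period: int) -> float:
    """Return y such that y**period == 0.5."""
    return 0.5 ** (1.0 / period)


def series_limit(period: int) -> float:
    """Limit of the geometric series 1 + y + y^2 + ... = 1 / (1 - y)."""
    return 1.0 / (1.0 - decay_factor(period))


for period in (16, 32, 64):
    y = decay_factor(period)
    print(f"period={period:2d}  y={y:.5f}  y^period={y ** period:.3f}  "
          f"series limit={series_limit(period):.2f}")
```

A shorter period gives a smaller y, so old contributions are forgotten
faster; a longer period does the opposite.
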
> > > To alleviate this (a default enabled option) UTIL_EST drives an (IIR) EWMA
> >
> > Expand IIR -- Immediate Impulse Response?
>
> Infinite Impulse Response
>
Sorry, yes, still worth an expansion.
> > > with the 'running' value on dequeue -- when it is highest. A further default
> > > enabled option UTIL_EST_FASTUP modifies the IIR filter to instantly increase
> > > and only decay on decrease.
> > >
> > > A further runqueue wide sum (of runnable tasks) is maintained of:
> > >
> > > util_est := \Sum_t max( t_running, t_util_est_ewma )
> > >
> > > For more detail see: kernel/sched/fair.c:util_est_dequeue()
> > >
> >
> > It's less obvious what the consequence is unless the reader manages to
> > tie the IO-wait comment in "Schedutil / DVFS" to this section.
>
> I'm not entirely sure I follow. The purpose of UTIL_EST is to avoid
> ramp-up issues and isn't related to IO-wait boosting.
>
I mixed up the example. Historically io-wait boosting was one way of
avoiding DVFS ramp-up issues but now that I reread it, it's best to leave
it general like you already have in your current version.
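
To make the UTIL_EST behaviour quoted above concrete, a minimal sketch
follows. It is a toy model, not the kernel implementation: the function
names are hypothetical, the 1/4 filter weight is an assumption chosen for
illustration, and utilisation is treated as a plain float.

```python
# Toy model of the UTIL_EST filter described above; not kernel code.
# The weight of 0.25 and all names here are assumptions for illustration.

def update_ewma(ewma: float, running: float,
                weight: float = 0.25, fastup: bool = True) -> float:
    """Fold the 'running' value sampled at dequeue into the EWMA.

    With fastup enabled the estimate jumps straight up to a higher
    sample and only decays gradually when the sample is lower.
    """
    if fastup and running > ewma:
        return running
    return ewma + weight * (running - ewma)


def rq_util_est(tasks: list[tuple[float, float]]) -> float:
    """Sum max(running, ewma) over the runnable tasks of a runqueue."""
    return sum(max(running, ewma) for running, ewma in tasks)


samples = [400.0, 200.0, 200.0]

ewma = 100.0
with_fastup = []
for s in samples:
    ewma = update_ewma(ewma, s, fastup=True)
    with_fastup.append(ewma)

ewma = 100.0
without_fastup = []
for s in samples:
    ewma = update_ewma(ewma, s, fastup=False)
    without_fastup.append(ewma)

print("fastup:   ", with_fastup)
print("no fastup:", without_fastup)
print("rq sum:   ", rq_util_est([(300.0, 250.0), (100.0, 180.0)]))
```

With fast-up the estimate tracks the burst immediately and then decays,
whereas the plain filter needs several dequeues to ramp up, which is the
ramp-up issue the option is meant to avoid.
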
> > Is it worth explicitly mentioning that a key advantage over
> > hardware-based approaches is that schedutil carries utilisation state on
> > CPU migration? You say that it is tracked but it's less obvious why that
> > matters as a pure hardware based approach loses utilisation information
> > about a task once it migrates.
>
> Not sure that was the exact goal of the document; I set out to describe
> schedutil.
>
Fair enough, it would simply lead to documentation creep.
> > Even moving note 3 below into this section and expanding it with an
> > example based on HWP would be helpful.
>
> I might not be the best person to talk about HWP; even though I work for
> Intel I know remarkably little of it. I don't even think I've got a
> machine that has it on.
>
> Latest version below... I'll probably send it as a patch soon and get it
> merged. We can always muck with it more later.
>
True. At least any confusion can then be driven by specific questions :)
FWIW, after reading the new version I'll ack the patch when it shows up.
Thanks!
--
Mel Gorman
SUSE Labs