Re: [PATCH 4/8] cpufreq/schedutil: sysfs capacity margin tunable

From: Peter Zijlstra
Date: Tue Mar 15 2016 - 17:49:45 EST


On Tue, Mar 15, 2016 at 02:40:43PM -0700, Michael Turquette wrote:
> Quoting Peter Zijlstra (2016-03-15 14:20:47)
> > On Sun, Mar 13, 2016 at 10:22:08PM -0700, Michael Turquette wrote:
> > > With the addition of the global cfs capacity margin helpers in patch,
> > > "sched/cpufreq: new cfs capacity margin helpers", we can now export
> > > sysfs tunables from the schedutil governor. This allows privileged users
> > > to tune the value more easily.
> > >
> > > The margin value is global to cfs, not per-policy. As such schedutil
> > > does not store any state about the margin. Schedutil restores the margin
> > > value to its default value when exiting.
> >
> > Yuck sysfs.. I would really rather we did not expose this per default.
> > And certainly not in this weird form.
>
> I'm happy to change capacity_margin to up_threshold and use a
> percentage.
>
> The sysfs approach has two benefits. First, it is aligned with cpufreq
> user expectations. Second, there has been rough consensus that this
> value should be tunable and sysfs gets us there quickly and painlessly.
> We're already exporting rate_limit_us for schedutil via sysfs. Is there
> a better way interface you can recommend?

It really depends on how tunable you want this to be. Do we always want
this to be a tunable, or just now while we're playing about with the
whole thing?

The problem with exposing it in sysfs is that you cannot take it out
again, it becomes ABI.

What we do for all the scheduler tunables (pretty much every time we
have to take a value out of thin air), is we make them const for
!SCHED_DEBUG builds, but have them as sysctl for SCHED_DEBUG builds
(although we should probably move them to /debug/sched/ or somesuch).

That way you get better code generation (compile time constants rule)
for !debug builds, while having the 'joy' of poking at your number on
debug builds.