Re: [PATCH v4 2/2] Documentation/sysctl: Document uclamp sysctl knobs

From: Qais Yousef
Date: Tue May 05 2020 - 10:56:46 EST


Hi Patrick

On 05/03/20 19:45, Patrick Bellasi wrote:
> > +sched_util_clamp_min:
> > +=====================
> > +
> > +Max allowed *minimum* utilization.
> > +
> > +Default value is SCHED_CAPACITY_SCALE (1024), which is the maximum possible
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Mmm... I feel one of the two is an implementation detail which should
> probably not be exposed?
>
> The user perhaps needs to know the value (1024) but we don't need to
> expose the internal representation.

Okay.

>
>
> > +value.
> > +
> > +It means that any requested uclamp.min value cannot be greater than
> > +sched_util_clamp_min, i.e., it is restricted to the range
> > +[0:sched_util_clamp_min].
> > +
> > +sched_util_clamp_max:
> > +=====================
> > +
> > +Max allowed *maximum* utilization.
> > +
> > +Default value is SCHED_CAPACITY_SCALE (1024), which is the maximum possible
> > +value.
> > +
> > +It means that any requested uclamp.max value cannot be greater than
> > +sched_util_clamp_max, i.e., it is restricted to the range
> > +[0:sched_util_clamp_max].
> > +
> > +sched_util_clamp_min_rt_default:
> > +================================
> > +
> > +By default Linux is tuned for performance. Which means that RT tasks always run
> > +at the highest frequency and most capable (highest capacity) CPU (in
> > +heterogeneous systems).
> > +
> > +Uclamp achieves this by setting the requested uclamp.min of all RT tasks to
> > +SCHED_CAPACITY_SCALE (1024) by default, which effectively boosts the tasks to
> > +run at the highest frequency and biases them to run on the biggest CPU.
> > +
> > +This knob allows admins to change the default behavior when uclamp is being
> > +used. In battery powered devices particularly, running at the maximum
> > +capacity and frequency will increase energy consumption and shorten the battery
> > +life.
> > +
> > +This knob is only effective for RT tasks which the user hasn't modified their
> > +requested uclamp.min value via sched_setattr() syscall.
> > +
> > +This knob will not escape the constraint imposed by sched_util_clamp_min
> > +defined above.
>
> Perhaps it's worth specifying that this value is going to be clamped by
> the values above? Otherwise it's a bit ambiguous what happens
> when it's bigger than sched_util_clamp_min.

Hmm for me that sentence says exactly what you're asking for.

So what you want is

s/will not escape the constraint imposed by/will be clamped by/

?

I'm not sure if this will help if the above is already ambiguous. Maybe if
I explicitly say

..will not escape the *range* constraint imposed by..

sched_util_clamp_min is already defined as a range constraint, so hopefully it
should hit the mark better now?
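
To make the range constraint concrete, here is a small Python sketch of the
arithmetic. It is not the kernel code; the helper name is made up for
illustration, and it only assumes what the text says: the RT default is
restricted to [0:sched_util_clamp_min].

```python
SCHED_CAPACITY_SCALE = 1024  # maximum possible utilization value


def effective_rt_boost(rt_default: int, clamp_min: int) -> int:
    """Hypothetical helper: restrict the RT default boost to [0:clamp_min]."""
    for name, value in (("rt_default", rt_default), ("clamp_min", clamp_min)):
        if not 0 <= value <= SCHED_CAPACITY_SCALE:
            raise ValueError(f"{name} must be in [0:{SCHED_CAPACITY_SCALE}]")
    # A value above clamp_min falls outside the permissible range, so it is
    # pulled down to clamp_min; anything inside the range is kept as-is.
    return min(rt_default, clamp_min)


restricted = effective_rt_boost(800, 600)     # 800 is outside [0:600]
unrestricted = effective_rt_boost(800, 1024)  # 800 is inside [0:1024]
print(restricted, unrestricted)
```

With sched_util_clamp_min at 600 the boost is held at 600; once the limit goes
back to 1024 the requested 800 takes effect again.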

>
> > +Any modification is applied lazily on the next opportunity the scheduler needs
> > +to calculate the effective value of uclamp.min of the task.
> ^^^^^^^^^
>
> This is also an implementation detail, I would remove it.

The idea is that this value is not updated 'immediately'/synchronously. So
currently RUNNING tasks will not see the effect, which could generate confusion
when users trip over it. IMO giving an idea of how it's updated will help set
users' expectations. I doubt many will care, but I think it's an important
element of the behavior that is worth conveying and documenting. I'd be happy to
reword it if necessary.
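
As a rough illustration of what "not updated synchronously" means for a user,
here is a toy Python model. It is only a sketch of the observable behavior; the
class and the idea of caching the value at wakeup are assumptions made for the
example, not a description of how the scheduler implements it.

```python
class ToyRtTask:
    """Toy model only: the task picks up the knob values when it wakes up."""

    def __init__(self) -> None:
        self.effective_min = None

    def wakeup(self, knobs: dict) -> None:
        # Recompute the effective uclamp.min from the current knob values.
        self.effective_min = min(
            knobs["sched_util_clamp_min_rt_default"],
            knobs["sched_util_clamp_min"],
        )


knobs = {"sched_util_clamp_min_rt_default": 1024, "sched_util_clamp_min": 1024}

task = ToyRtTask()
task.wakeup(knobs)

# The admin lowers the default while the task is still running.
knobs["sched_util_clamp_min_rt_default"] = 512
stale = task.effective_min  # the running task has not seen the change yet

task.wakeup(knobs)
fresh = task.effective_min  # the new value is visible after the next wakeup

print(stale, fresh)
```

The point is just that a write to the knob and the task observing it are two
separate events, which is what I'd like the documentation to hint at.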

I have this now

"""
This knob will not escape the range constraint imposed by sched_util_clamp_min
defined above.

For example if

sched_util_clamp_min_rt_default = 800
sched_util_clamp_min = 600

Then the boost will be clamped to 600 because 800 is outside of the permissible
range of [0:600]. This could happen for instance if a powersave mode
restricts all boosts temporarily by modifying sched_util_clamp_min. As soon as
this restriction is lifted, the requested sched_util_clamp_min_rt_default
will take effect.

Any modification is applied lazily to currently running tasks and should be
visible by the next wakeup.
"""

Thanks

--
Qais Yousef