Re: [PATCH v4 6/8] sched/fair: Add sched group latency support

From: Dietmar Eggemann
Date: Tue Sep 20 2022 - 14:18:06 EST


On 19/09/2022 17:49, Vincent Guittot wrote:
> On Mon, 19 Sept 2022 at 13:55, Dietmar Eggemann
> <dietmar.eggemann@xxxxxxx> wrote:
>>
>> s/valentin.schneider@xxxxxxx//
>>
>> On 16/09/2022 10:03, Vincent Guittot wrote:
>>> Task can set its latency priority, which is then used to decide to preempt
>>> the current running entity of the cfs, but sched group entities still have
>>> the default latency offset.
>>>
>>> Add a latency field in task group to set the latency offset of the
>>> sched_eneities of the group, which will be used against other entities in
>>
>> s/sched_eneities/sched_entity
>>
>>> the parent cfs when deciding which entity to schedule first.
>>
>> So latency for cgroups does not follow any (existing) Resource
>> Distribution Model/Scheme (Documentation/admin-guide/cgroup-v2.rst)?
>> Latency values are only used to compare sched entities at the same level.
>
> Just like share/cpu.weight value does for time sharing

But for cpu.weight we explicitly define it as following the `Weights`
scheme. That's why I was asking.

>> [...]
>>
>>> +static int cpu_latency_write_s64(struct cgroup_subsys_state *css,
>>> + struct cftype *cft, s64 latency)
>>> +{
>>
>> There is no [MIN, MAX] checking?
>
> This is done in sched_group_set_latency() which checks that
> abs(latency) < sysctl_sched_latency

I see. Nit-picking: Wouldn't this allow specifying a latency offset
value for the non-existent `nice = 20`? The highest nice value, 19, maps
to `973/1024 * sysctl_sched_latency`.
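
To make the gap concrete, here is a small userspace sketch (not kernel
code). It assumes sysctl_sched_latency = 24ms and NICE_LATENCY_SHIFT = 10,
and it takes the abs() comparison literally as quoted above; whether the
real bound is inclusive is not something this sketch settles. The helper
names are made up for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

#define NICE_LATENCY_SHIFT	10		/* assumed: weights scaled by 1024 */
#define SYSCTL_SCHED_LATENCY	24000000LL	/* assumed: 24ms in ns */

/* Hypothetical helper: latency offset (ns) for a given latency weight. */
static int64_t latency_weight_to_offset(int64_t weight)
{
	/*
	 * The thread uses '>> NICE_LATENCY_SHIFT'; division is used here
	 * because right-shifting negative values is implementation-defined
	 * in userspace C.
	 */
	return SYSCTL_SCHED_LATENCY * weight / (1LL << NICE_LATENCY_SHIFT);
}

/* Range check as quoted in the thread: abs(latency) < sysctl_sched_latency. */
static bool latency_in_range(int64_t latency)
{
	int64_t abs_latency = latency < 0 ? -latency : latency;

	return abs_latency < SYSCTL_SCHED_LATENCY;
}
```

With these numbers the nice 19 weight (973) gives 22,804,687ns, yet a
value like 23,500,000ns still passes the check, i.e. anything between the
nice 19 offset and sysctl_sched_latency is accepted.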

>
>>
>> min_weight = sched_latency_to_weight[0] = -1024
>> max_weight = sched_latency_to_weight[39] = 973
>>
>> [MIN, MAX] = [sysctl_sched_latency * min_weight >> NICE_LATENCY_SHIFT,
>> sysctl_sched_latency * max_weight >> NICE_LATENCY_SHIFT]
>>
>>
>> With the `cpu.latency` knob user would have to know for example that the
>> value is -24,000,000ns to get the same behaviour as for a task latency
>> nice = -20 (latency prio = 0) (w/ sysctl_sched_latency = 24ms)?
>
> Yes, Tejun raised some concerns about adding an interface like nice in
> the task group in v2 so I have removed it.
>
>>
>> For `nice` we have `cpu.weight.nice` next to `cpu.weight` in cgroup v2 ?
>
> If everybody is ok, I can add back the cpu.latency.nice interface in
> the v5 in addition to the cpu.latency

cpu.weight/cpu.weight.nice interface:

echo X > cpu.weight          tg->shares

      1                          10,240
    100                       1,048,576
  10000                     104,857,600

echo X > cpu.weight.nice     tg->shares

    -20                      90,891,264
      0                       1,048,576
     19                          15,360
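
For reference, the tg->shares numbers above can be reproduced with the
userspace sketch below. It mirrors, as far as I can tell, what
sched_weight_from_cgroup() and scale_load() do on 64-bit (shift by
SCHED_FIXEDPOINT_SHIFT = 10); treat those details as assumptions. The
function names cpu_weight_to_shares() and cpu_nice_to_shares() are
hypothetical, and only the three nice rows listed above are filled in:

```c
#include <stdint.h>

#define CGROUP_WEIGHT_DFL	100
#define SCHED_FIXEDPOINT_SHIFT	10	/* assumed 64-bit load resolution */

/* Userspace stand-in for the kernel's scale_load() on 64-bit. */
static uint64_t scale_load_sketch(uint64_t w)
{
	return w << SCHED_FIXEDPOINT_SHIFT;
}

/* cpu.weight [1..100..10000] -> tg->shares */
static uint64_t cpu_weight_to_shares(uint64_t cgrp_weight)
{
	/* Round-to-closest of cgrp_weight * 1024 / CGROUP_WEIGHT_DFL. */
	uint64_t w = (cgrp_weight * 1024 + CGROUP_WEIGHT_DFL / 2) /
		     CGROUP_WEIGHT_DFL;

	return scale_load_sketch(w);
}

/* cpu.weight.nice -> tg->shares, only for the rows shown in the table. */
static uint64_t cpu_nice_to_shares(int nice)
{
	switch (nice) {
	case -20: return scale_load_sketch(88761);
	case 0:   return scale_load_sketch(1024);
	case 19:  return scale_load_sketch(15);
	default:  return 0;	/* other nice levels not covered by this sketch */
	}
}
```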

Wouldn't a similar interface, with cpu.latency [1..100..10000] and
cpu.latency.nice [-20..0..19], make the most sense then?

Raw latency_offset values at interface level are not portable, since
their meaning depends on the value of sysctl_sched_latency.