Re: [Patch v5 2/6] sched/fair: Add infrastructure to store and update instantaneous thermal pressure

From: Dietmar Eggemann
Date: Wed Nov 06 2019 - 07:50:32 EST


On 05/11/2019 22:53, Ionela Voinescu wrote:
> On Tuesday 05 Nov 2019 at 16:29:32 (-0500), Thara Gopinath wrote:
>> On 11/05/2019 04:15 PM, Ionela Voinescu wrote:
>>> On Tuesday 05 Nov 2019 at 16:02:00 (-0500), Thara Gopinath wrote:
>>>> On 11/05/2019 03:21 PM, Ionela Voinescu wrote:
>>>>> Hi Thara,
>>>>>
>>>>> On Tuesday 05 Nov 2019 at 13:49:42 (-0500), Thara Gopinath wrote:
>>>>> [...]
>>>>>> +static void trigger_thermal_pressure_average(struct rq *rq)
>>>>>> +{
>>>>>> +#ifdef CONFIG_SMP
>>>>>> + update_thermal_load_avg(rq_clock_task(rq), rq,
>>>>>> + per_cpu(thermal_pressure, cpu_of(rq)));
>>>>>> +#endif
>>>>>> +}
>>>>>
>>>>> Why did you decide to keep trigger_thermal_pressure_average and not
>>>>> call update_thermal_load_avg directly?
>>>>>
>>>>> For !CONFIG_SMP you already have an update_thermal_load_avg function
>>>>> that does nothing, in kernel/sched/pelt.h, so you don't need that
>>>>> ifdef.
>>>> Hi,
>>>>
>>>> Yes you are right. But later with the shift option added, I shift
>>>> rq_clock_task(rq) by the shift. I thought it is better to contain it in
>>>> a function that replicate it in three different places. I can remove the
>>>> CONFIG_SMP in the next version.
>>>
>>> You could still keep that in one place if you shift the now argument of
>>> ___update_load_sum instead.
>>
>> No. I cannot do this. The authors of the pelt framework prefers not to
>> include a shift parameter there. I had discussed this with Vincent earlier.
>>
>
> Right! I missed Vincent's last comment on this in v4.
>
> I would argue that it's more of a PELT parameter than a CFS parameter
> :), where it's currently being used. I would also argue that's more of a
> PELT parameter than a thermal parameter. It controls the PELT time
> progression for the thermal signal, but it seems more to configure the
> PELT algorithm, rather than directly characterize thermals.
>
> In any case, it only seemed to me that adding a wrapper function for
> this purpose only was not worth doing.

Coming back to the v4 discussion
https://lore.kernel.org/r/379d23e5-79a5-9d90-0fb6-125d9be85e99@xxxxxxx

There is no API between pelt.c and other parts of the scheduler/kernel
so why should we keep an unnecessary parameter and wrapper functions?

There is also no abstraction, update_thermal_load_avg() in pelt.c even
carries '_thermal_' in its name.

So why not define this shift value '[sched_|pelt_]thermal_decay_shift'
there as well? It belongs to update_thermal_load_avg().

All call sites of update_thermal_load_avg() use 'rq_clock_task(rq) >>
sched_thermal_decay_shift' so I don't see the need to pass it in.

IMHO, preparing for eventual code changes (e.g. parsing different now
values) is not a good practice in the kernel. Keeping the code small and
tidy is.