Re: [Patch v5 2/6] sched/fair: Add infrastructure to store and update instantaneous thermal pressure

From: Thara Gopinath
Date: Wed Nov 06 2019 - 12:53:20 EST


On 11/06/2019 07:50 AM, Dietmar Eggemann wrote:
> On 05/11/2019 22:53, Ionela Voinescu wrote:
>> On Tuesday 05 Nov 2019 at 16:29:32 (-0500), Thara Gopinath wrote:
>>> On 11/05/2019 04:15 PM, Ionela Voinescu wrote:
>>>> On Tuesday 05 Nov 2019 at 16:02:00 (-0500), Thara Gopinath wrote:
>>>>> On 11/05/2019 03:21 PM, Ionela Voinescu wrote:
>>>>>> Hi Thara,
>>>>>>
>>>>>> On Tuesday 05 Nov 2019 at 13:49:42 (-0500), Thara Gopinath wrote:
>>>>>> [...]
>>>>>>> +static void trigger_thermal_pressure_average(struct rq *rq)
>>>>>>> +{
>>>>>>> +#ifdef CONFIG_SMP
>>>>>>> + update_thermal_load_avg(rq_clock_task(rq), rq,
>>>>>>> + per_cpu(thermal_pressure, cpu_of(rq)));
>>>>>>> +#endif
>>>>>>> +}
>>>>>>
>>>>>> Why did you decide to keep trigger_thermal_pressure_average and not
>>>>>> call update_thermal_load_avg directly?
>>>>>>
>>>>>> For !CONFIG_SMP you already have an update_thermal_load_avg function
>>>>>> that does nothing, in kernel/sched/pelt.h, so you don't need that
>>>>>> ifdef.
>>>>> Hi,
>>>>>
>>>>> Yes you are right. But later with the shift option added, I shift
>>>>> rq_clock_task(rq) by the shift. I thought it is better to contain it in
>>>>> a function that replicate it in three different places. I can remove the
>>>>> CONFIG_SMP in the next version.
>>>>
>>>> You could still keep that in one place if you shift the now argument of
>>>> ___update_load_sum instead.
>>>
>>> No. I cannot do this. The authors of the pelt framework prefers not to
>>> include a shift parameter there. I had discussed this with Vincent earlier.
>>>
>>
>> Right! I missed Vincent's last comment on this in v4.
>>
>> I would argue that it's more of a PELT parameter than a CFS parameter
>> :), where it's currently being used. I would also argue that's more of a
>> PELT parameter than a thermal parameter. It controls the PELT time
>> progression for the thermal signal, but it seems more to configure the
>> PELT algorithm, rather than directly characterize thermals.
>>
>> In any case, it only seemed to me that adding a wrapper function for
>> this purpose only was not worth doing.
>
> Coming back to the v4 discussion
> https://lore.kernel.org/r/379d23e5-79a5-9d90-0fb6-125d9be85e99@xxxxxxx
>
> There is no API between pelt.c and other parts of the scheduler/kernel
> so why should we keep an unnecessary parameter and wrapper functions?
>
> There is also no abstraction, update_thermal_load_avg() in pelt.c even
> carries '_thermal_' in its name.
>
> So why not define this shift value '[sched_|pelt_]thermal_decay_shift'
> there as well? It belongs to update_thermal_load_avg().
>
> All call sites of update_thermal_load_avg() use 'rq_clock_task(rq) >>
> sched_thermal_decay_shift' so I don't see the need to pass it in.
>
> IMHO, preparing for eventual code changes (e.g. parsing different now
> values) is not a good practice in the kernel. Keeping the code small and
> tidy is.

I think we are going in circles on this one. I acknowledge you have an
issue. That being said, I also understand the need to keep the pelt
framework code tight. Also Ionela pointed out that there could be a need
for a faster decay in which case it could mean a left shift leading to
further complications if defined in pelt.c (I am not saying that I will
implement a faster decay in this patch set but it is more of a future
extension if needed!)

I can make trigger_thermal_pressure_average inline if that will
alleviate some of the concerns.

I would also urge you to reconsider the merits of arguing this point
back and forth.
>


--
Warm Regards
Thara