Re: [Patch v9 7/8] sched/fair: Enable tuning of decay period

From: Dietmar Eggemann
Date: Fri Feb 14 2020 - 05:27:03 EST


On 13/02/2020 14:54, Thara Gopinath wrote:
> On 02/10/2020 06:59 AM, Dietmar Eggemann wrote:
>> On 07/02/2020 23:42, Thara Gopinath wrote:
>>> On 02/04/2020 03:39 AM, Dietmar Eggemann wrote:
>>>> On 03/02/2020 16:55, Peter Zijlstra wrote:
>>>>> On Mon, Feb 03, 2020 at 07:07:57AM -0500, Thara Gopinath wrote:
>>>>>> On 01/28/2020 06:56 PM, Randy Dunlap wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 1/28/20 2:36 PM, Thara Gopinath wrote:

[...]

>> is really not saying from which review comment the individual changes in
>> the function name are coming from. And I don't see an answer to Ionela's
>> email saying that her proposal will manifest in a particular part of
>> this change.
> Hi Dietmar,
>
> Like I said, don't want to argue on name. It is trivial for me. I have
> v10 prepped with the name change. Will send it out shortly.

Thanks.

[...]

>> Cpu-invariant accounting can't be guarded with a kernel CONFIG switch.
>> Frequency-invariant accounting could be with CONFIG_CPU_FREQ but this is
>> enabled by default by Arm64 defconfig.
>> Thermal pressure (accounting) (CONFIG_HAVE_SCHED_THERMAL_PRESSURE) is
>> disabled by default so why should a per-cpu thermal_pressure be
>> maintained on such a system (CONFIG_CPU_THERMAL=y by default)?
>
> I agree that there is no need for per-cpu thermal pressure to be
> maintained if no averaging is happening in the scheduler, today. I don't
> know if there will ever be an use for it.

All arch_scale_FOO() functions follow the approach to force the arch
(currently x86, arm, arm64) to do

#define arch_scale_FOO BAR

to enable the FOO functionality.

There is no direct link between consumer and provider here.

consumer (sched) -> arch <- provider (arch, counters, CPUfreq, CPU
cooling, etc.)

So IMHO, FOO=thermal_pressure should follow this design pattern too.

'thermal_pressure' would be the only one which can be disabled by a
kernel config switch at the consumer side.
IMHO, it doesn't make sense to have the provider operating in this case.

> My issue has to do with using a config option meant for internal
> scheduler code being used else where. To me, once this happens, the
> entire work done to separate out reading and writing of instantaneous
> thermal pressure to arch_topology makes no sense. We could have kept it
> in scheduler itself.

You might see thermal_pressure more on the level of irq_load or
[rt/dl]_rq_load and that could be why we have a different opinion here?

Now rt_rq_load and dl_rq_load are scheduler internal providers and
irq_load is driven by 'irq_delta + steal' time (which is much closer to
the scheduler than thermal for instance).

My assumption is that we don't want a direct link between the scheduler
and e.g. a provider 'thermal'.

> Another way I think about this whole thermal pressure framework is that
> it is the job of cooling device or cpufreq or any other entity to update
> a throttle in maximum pressure to the scheduler. It should be
> independent of what scheduler does with it. Scheduler can choose to
> ignore it