Re: v5.7: new core kernel option missing help text

From: Thara Gopinath
Date: Wed Jun 03 2020 - 20:50:47 EST




On 6/3/20 4:25 PM, Valentin Schneider wrote:

On 03/06/20 20:58, Russell King - ARM Linux admin wrote:
On Wed, Jun 03, 2020 at 09:24:56PM +0200, Vincent Guittot wrote:
On Wed, 3 Jun 2020 at 20:45, Russell King - ARM Linux admin
<linux@xxxxxxxxxxxxxxx> wrote:
It's a start. I'm still wondering whether I should answer yes or no
for the platforms I'm building for.

So far, all I've found is:

arch/arm/include/asm/topology.h:#define arch_scale_thermal_pressure topology_get_thermal_pressure

which really doesn't tell me anything about this. So I'm still in
the dark.

I guess topology_get_thermal_pressure is provided by something in
drivers/ which will be conditional on some driver or something.

You need a cpufreq_cooling device to make it useful, and only on SMP.
I don't think this should be user configurable because, even with the
description above, it is not easy to choose.
This should be set by the driver that implements the feature, which is
only the cpufreq cooling device for now.

As I have CONFIG_CPU_FREQ_THERMAL=y in my config, I'm guessing (and it's
only a guess) that I should say y to SCHED_THERMAL_PRESSURE ?


arm and arm64 implement arch_scale_thermal_pressure(); the actual
implementation is in the arch_topology "driver" (GENERIC_ARCH_TOPOLOGY).

Then, the caller of arch_set_thermal_pressure() is cpufreq_cooling (see
below); that'll only get called if you have thermal zones using CPU
cooling devices.

AFAICT the current state of things implies we should have something like

depends on (ARM || ARM64) && GENERIC_ARCH_TOPOLOGY

for that option.

Hi Russell/Valentin,

The feature itself, as Valentin explains below, allows the scheduler to be aware of CPU capacity reduced by thermal throttling. arch_set_thermal_pressure() feeds the capped capacity to the scheduler, so the feature makes sense only if arch_set_thermal_pressure() is implemented. Currently arch_set_thermal_pressure() is implemented in the arch_topology driver for arm and arm64 platforms, but the feature itself is not bound to arm/arm64 platforms. So it would be wrong to add a "depends on (ARM || ARM64)" clause to the option.

I agree with Vincent that letting the user choose this option is probably not the best approach. IMO, it should be enabled by default in the arm64 defconfig, considering both GENERIC_ARCH_TOPOLOGY and CPU_FREQ_THERMAL are enabled by default.
So, if that is acceptable, the three things to be done are:
1. Add the help text.
2. Don't make SCHED_THERMAL_PRESSURE user configurable.
3. Enable it by default in the arm64 defconfig.


+ help
+ This option allows the scheduler to be aware of CPU thermal throttling
+ (i.e. thermal pressure), providing arch_scale_thermal_pressure() is
+ implemented.

Is this feature documented in terms of what it does? Do I assume that
as the thermal trip points start tripping, that has an influence on
the scheduler? Or is it the case that the scheduler is wanting to
know when the cpu frequency changes?

Grepping for "thermal" in Documentation/scheduler brings up nothing.

The former; changing a CPU cooling device's state (IOW changing its max
allowed frequency for thermal reasons) leads to a call to
arch_set_thermal_pressure() (see
cpufreq_cooling.c::cpufreq_set_cur_state()).
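
To make that path a bit more concrete, here is a minimal standalone
sketch of how a frequency cap could be turned into a thermal pressure
value on the usual 0..1024 capacity scale. It is not the kernel code:
the helper name thermal_pressure_from_cap() and the linear scaling by
capped_freq/max_freq are assumptions made for illustration only.

```c
#include <assert.h>

/*
 * Hypothetical helper (not a kernel function): capacity lost when a
 * cooling device caps the maximum frequency of a CPU.
 *
 * Assumption: capacity scales linearly with frequency, so
 *   capped_capacity = max_capacity * capped_freq / max_freq
 *   pressure        = max_capacity - capped_capacity
 */
static unsigned long thermal_pressure_from_cap(unsigned long max_capacity,
					       unsigned long max_freq_khz,
					       unsigned long capped_freq_khz)
{
	unsigned long long capped_capacity;

	assert(max_freq_khz != 0);

	/* A cap at or above the maximum frequency costs nothing. */
	if (capped_freq_khz >= max_freq_khz)
		return 0;

	/* 64-bit intermediate so the product cannot overflow on 32-bit. */
	capped_capacity = (unsigned long long)max_capacity * capped_freq_khz;
	capped_capacity /= max_freq_khz;

	return max_capacity - (unsigned long)capped_capacity;
}
```

With a CPU of capacity 1024 capped from 2.0 GHz to 1.5 GHz, this sketch
yields a pressure of 256, i.e. the scheduler would treat a quarter of
the CPU's capacity as unavailable.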

It's somewhat interesting to have, at least in theory. On plain SMP that
would let the scheduler see if some CPUs are more throttled than others,
which would be leveraged when doing load balancing. It's more
interesting for big.LITTLE & co, where in the worst cases we can have
things like capacity inversion, i.e. the bigs are so thermally throttled
that they give less oomph than a LITTLE.
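
The capacity inversion case can be illustrated with a small standalone
sketch. The struct cpu_cap type, both helpers and the example numbers
are hypothetical, and the sketch ignores any averaging or other
capacity deductions the real scheduler may apply; it only shows the
comparison being described.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU view: original capacity and thermal pressure. */
struct cpu_cap {
	unsigned long max_capacity;	/* 0..1024 scale */
	unsigned long thermal_pressure;	/* capacity lost to throttling */
};

/* Capacity left once the thermal pressure is subtracted. */
static unsigned long effective_capacity(const struct cpu_cap *c)
{
	assert(c->thermal_pressure <= c->max_capacity);
	return c->max_capacity - c->thermal_pressure;
}

/* True when a throttled big CPU offers less than a LITTLE one. */
static bool capacity_inverted(const struct cpu_cap *big,
			      const struct cpu_cap *little)
{
	return effective_capacity(big) < effective_capacity(little);
}
```

For example, a big CPU with capacity 1024 under a pressure of 700 is
left with 324, which is below an unthrottled LITTLE at 446, so a
load balancer that sees thermal pressure could prefer the LITTLE.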


--
Warm Regards
Thara