Re: [Patch v6 3/7] Add infrastructure to store and update instantaneous thermal pressure

From: Dietmar Eggemann
Date: Tue Dec 17 2019 - 07:54:34 EST


On 12/12/2019 05:11, Thara Gopinath wrote:

Shouldn't the subject contain something like 'arm,arm64,drivers:' ?

> Add architecture specific APIs to update and track thermal pressure on a
> per cpu basis. A per cpu variable thermal_pressure is introduced to keep
> track of instantaneous per cpu thermal pressure. Thermal pressure is the
> delta between maximum capacity and capped capacity due to a thermal event.
> capacity and capped capacity due to a thermal event.
>
> topology_get_thermal_pressure can be hooked into the scheduler specified
> arch_scale_thermal_capacity to retrieve instantaneius thermal pressure of

s/instantaneius/instantaneous

> a cpu.
>
> arch_set_thermal_presure can be used to update the thermal pressure by

s/presure/pressure

> providing a capped maximum capacity.
>
> Considering topology_get_thermal_pressure reads thermal_pressure and
> arch_set_thermal_pressure writes into thermal_pressure, one can argue for
> some sort of locking mechanism to avoid a stale value. But considering
> topology_get_thermal_pressure_average can be called from a system critical

s/topology_get_thermal_pressure_average/topology_get_thermal_pressure ?

> path like scheduler tick function, a locking mechanism is not ideal. This
> means that it is possible the thermal_pressure value used to calculate
> average thermal pressure for a cpu can be stale for upto 1 tick period.
>
> Signed-off-by: Thara Gopinath <thara.gopinath@xxxxxxxxxx>
> ---
>
> v3->v4:
> - Dropped per cpu max_capacity_info struct and instead added a per
> delta_capacity variable to store the delta between maximum
> capacity and capped capacity. The delta is now calculated when
> thermal pressure is updated and not every tick.
> - Dropped populate_max_capacity_info api as only per cpu delta
> capacity is stored.
> - Renamed update_periodic_maxcap to
> trigger_thermal_pressure_average and update_maxcap_capacity to
> update_thermal_pressure.
> v4->v5:
> - As per Peter's review comments folded thermal.c into fair.c.
> - As per Ionela's review comments revamped update_thermal_pressure
> to take maximum available capacity as input instead of maximum
> capped frequency ration.
> v5->v6:
> - As per review comments moved all the infrastructure to track
> and retrieve instantaneous thermal pressure out of scheduler
> to topology files.
>
> arch/arm/include/asm/topology.h | 3 +++
> arch/arm64/include/asm/topology.h | 3 +++
> drivers/base/arch_topology.c | 13 +++++++++++++
> include/linux/arch_topology.h | 11 +++++++++++
> 4 files changed, 30 insertions(+)
>
> diff --git a/arch/arm/include/asm/topology.h b/arch/arm/include/asm/topology.h
> index 8a0fae9..90b18c3 100644
> --- a/arch/arm/include/asm/topology.h
> +++ b/arch/arm/include/asm/topology.h
> @@ -16,6 +16,9 @@
> /* Enable topology flag updates */
> #define arch_update_cpu_topology topology_update_cpu_topology
>
> +/* Replace task scheduler's defalut thermal pressure retrieve API */
> +#define arch_scale_thermal_capacity topology_get_thermal_pressure
> +
> #else
>
> static inline void init_cpu_topology(void) { }
> diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
> index a4d945d..ccb277b 100644
> --- a/arch/arm64/include/asm/topology.h
> +++ b/arch/arm64/include/asm/topology.h
> @@ -25,6 +25,9 @@ int pcibus_to_node(struct pci_bus *bus);
> /* Enable topology flag updates */
> #define arch_update_cpu_topology topology_update_cpu_topology
>
> +/* Replace task scheduler's defalut thermal pressure retrieve API */
> +#define arch_scale_thermal_capacity topology_get_thermal_pressure
> +
> #include <asm-generic/topology.h>
>
> #endif /* _ASM_ARM_TOPOLOGY_H */
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index 1eb81f11..3a91379 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -42,6 +42,19 @@ void topology_set_cpu_scale(unsigned int cpu, unsigned long capacity)
> per_cpu(cpu_scale, cpu) = capacity;
> }
>
> +DEFINE_PER_CPU(unsigned long, thermal_pressure);
> +
> +void arch_set_thermal_pressure(struct cpumask *cpus,
> + unsigned long capped_capacity)
> +{
> + int cpu;
> + unsigned long delta = arch_scale_cpu_capacity(cpumask_first(cpus)) -
> + capped_capacity;

unsigned long delta;
int cpu;

delta = arch_scale_cpu_capacity(cpumask_first(cpus)) - capped_capacity;

Easier to read plus order local variable declarations from longest to
shortest line.

https://lore.kernel.org/r/20181107171149.165693799@xxxxxxxxxxxxx

> +
> + for_each_cpu(cpu, cpus)
> + WRITE_ONCE(per_cpu(thermal_pressure, cpu), delta);
> +}
> +
> static ssize_t cpu_capacity_show(struct device *dev,
> struct device_attribute *attr,
> char *buf)
> diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
> index 3015ecb..7a04364 100644
> --- a/include/linux/arch_topology.h
> +++ b/include/linux/arch_topology.h
> @@ -33,6 +33,17 @@ unsigned long topology_get_freq_scale(int cpu)
> return per_cpu(freq_scale, cpu);
> }
>
> +DECLARE_PER_CPU(unsigned long, thermal_pressure);
> +
> +static inline
> +unsigned long topology_get_thermal_pressure(int cpu)

Save one line:

static inline unsigned long topology_get_thermal_pressure(int cpu)

> +{
> + return per_cpu(thermal_pressure, cpu);
> +}
> +
> +void arch_set_thermal_pressure(struct cpumask *cpus,
> + unsigned long capped_capacity);
> +
> struct cpu_topology {
> int thread_id;
> int core_id;
>