Re: [PATCH 1/3] thermal: intel: int340x: Add platform temperature control interface

From: srinivas pandruvada
Date: Fri Mar 14 2025 - 10:30:53 EST


On Tue, 2025-03-11 at 05:56 +0000, Zhang, Rui wrote:
> On Sat, 2025-03-08 at 10:38 -0800, Srinivas Pandruvada wrote:
> > Platform Temperature Control is a dynamic control loop implemented
> > in
> > hardware to manage the skin or any board temperature of a device.
> > The
> > reported skin or board temperature is controlled by comparing to a
> > configured target temperature and adjusting the SoC (System on
> > Chip)
> > performance accordingly. The feature supports up to three platform
> > sensors.
> >
> > OEMs (Original Equipment Manufacturers) can configure this feature
> > through the BIOS and provide temperature input directly to the
> > hardware
> > via the Platform Environment Control Interface (PECI).
>
> Does this mean each PTC instance is bound to a certain skin/board
> sensor?
Yes.

> And writing the "temperature_target" sysfs tells firmware the
> target temperature, as well as the target sensor that the temperature
> applies to? 
Yes

> if yes, is there a way for userspace to know which sensor
> each PTC instance applies to?
No.
You can just change threshold up and down by observation only.

Thanks,
Srinivas

>
> >  As a result,
> > this feature can operate independently of any OS-level control.
> >
> > The OS interface can be used to further fine-tune the default OEM
> > configuration. Here are some scenarios where the OS interface is
> > beneficial:
> > Verification of Firmware Control: Check if firmware-based control
> > is
> > enabled. If it is, thermal controls from the OS/user space can be
> > backed out.
> > Adjusting Target Limits: While OEMs can set an aggressive target
> > limit,
> > the OS can adjust this to a less aggressive limit based on
> > operating
> > modes or conditions.
> >
> > The hardware control interface is via a MMIO offsets via processor
> > thermal device. There are three instances of MMIO registers. All
> > are 64 bit registers
> >
> > Name: PLATFORM_TEMPERATURE_CONTROL
> > Offsets: 0x5B20, 0x5B28, 0x5B30
> >
> > All values have RW access
> >
> > Bits    Description
> > 7:0     TARGET_TEMP : Target temperature limit to which the control
> >         mechanism is regulating. Units: 0.5C.
> > 8:8     ENABLE: Read current enable status of the feature or enable
> > feature.
> > 11:9 GAIN: Sets the aggressiveness of control loop from 0 to 7
> > 7 graceful, favors performance at the expense of
> > temperature
> > overshoots
> > 0 aggressive, favors tight regulation over performance
> > 12:12 TEMPERATURE_OVERRIDE_EN
> > When set, hardware will use TEMPERATURE_OVERRIDE values
> > instead
> > of reading from corresponding sensor.
> > 15:13 RESERVED
> > 23:16 MIN_PERFORMANCE_LEVEL: Minimum Performance level below
> > which
> > the
> > there will be no throttling. 0 - all levels of throttling
> > allowed
> > including survivability actions. 255 - no throttling
> > allowed.
> > 31:24 TEMPERATURE_OVERRIDE: Allows SW to override the input
> > temperature.
> > hardware will use this value instead of the sensor
> > temperature.
> > Units: 0.5C.
> > 63:32 RESERVED
> >
> > Out of the above controls, only "enable" and "temperature_target"
> > is
> > exposed via sysfs.
> > There are three instances of this controls. So up to three
> > different
> > sensors can be controlled independently.
> >
> > Sysfs interface:
> > tree
> > /sys/bus/pci/devices/0000\:00\:04.0/platform_temperature_?_control/
> > /sys/bus/pci/devices/0000:00:04.0/platform_temperature_0_control/
> > ├── enable
> > ├── temperature_target
> > /sys/bus/pci/devices/0000:00:04.0/platform_temperature_1_control/
> > ├── enable
> > ├── temperature_target
> > /sys/bus/pci/devices/0000:00:04.0/platform_temperature_2_control/
> > ├── enable
> > ├── temperature_target
> >
> > Description of attributes:
> >
> > Enable: 1 for enable, 0 for disable. This attribute can be used to
> > read the current status. User space can write 0 or 1 to disable or
> > enable this feature respectively.
> > temperature_target: Target temperature limit to which hardware
> > will try to limit in milli degree C.
> >
> > Signed-off-by: Srinivas Pandruvada
> > <srinivas.pandruvada@xxxxxxxxxxxxxxx>
> > ---
> >  .../thermal/intel/int340x_thermal/Makefile    |   1 +
> >  .../platform_temperature_control.c            | 181
> > ++++++++++++++++++
> >  .../processor_thermal_device.c                |  15 +-
> >  .../processor_thermal_device.h                |   3 +
> >  4 files changed, 199 insertions(+), 1 deletion(-)
> >  create mode 100644
> > drivers/thermal/intel/int340x_thermal/platform_temperature_control.
> > c
> >
> > diff --git a/drivers/thermal/intel/int340x_thermal/Makefile
> > b/drivers/thermal/intel/int340x_thermal/Makefile
> > index fe3f43924525..184318d1792b 100644
> > --- a/drivers/thermal/intel/int340x_thermal/Makefile
> > +++ b/drivers/thermal/intel/int340x_thermal/Makefile
> > @@ -9,6 +9,7 @@ obj-$(CONFIG_INT340X_THERMAL) +=
> > processor_thermal_device_pci_legacy.o
> >  obj-$(CONFIG_INT340X_THERMAL) += processor_thermal_device_pci.o
> >  obj-$(CONFIG_PROC_THERMAL_MMIO_RAPL) += processor_thermal_rapl.o
> >  obj-$(CONFIG_INT340X_THERMAL) += processor_thermal_rfim.o
> > +obj-$(CONFIG_INT340X_THERMAL) += platform_temperature_control.o
> >  obj-$(CONFIG_INT340X_THERMAL) += processor_thermal_mbox.o
> >  obj-$(CONFIG_INT340X_THERMAL) += processor_thermal_wt_req.o
> >  obj-$(CONFIG_INT340X_THERMAL) += processor_thermal_wt_hint.o
> > diff --git
> > a/drivers/thermal/intel/int340x_thermal/platform_temperature_contro
> > l.
> > c
> > b/drivers/thermal/intel/int340x_thermal/platform_temperature_contro
> > l.
> > c
> > new file mode 100644
> > index 000000000000..dd3ea7165800
> > --- /dev/null
> > +++
> > b/drivers/thermal/intel/int340x_thermal/platform_temperature_contro
> > l.
> > c
> > @@ -0,0 +1,181 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * processor thermal device platform temperature controls
> > + * Copyright (c) 2025, Intel Corporation.
> > + */
> > +
> > +#include <linux/kernel.h>
> > +#include <linux/module.h>
> > +#include <linux/pci.h>
> > +#include "processor_thermal_device.h"
> > +
> > +struct mmio_reg {
> > + int bits;
> > + u16 mask;
> > + u16 shift;
> > + u16 units;
> > +};
> > +
> > +#define MAX_ATTR_GROUP_NAME_LEN 32
> > +
> > +struct ptc_data {
> > + u32 offset;
> > + struct attribute_group ptc_attr_group;
> > + struct attribute *ptc_attrs[3];
> > + struct device_attribute temperature_target_attr;
> > + struct device_attribute enable_attr;
> > + char group_name[MAX_ATTR_GROUP_NAME_LEN];
> > +};
> > +
> > +static const struct mmio_reg ptc_mmio_regs[] = {
> > + { 8, 0xFF, 0, 500}, /* temperature_target, units 0.5C*/
> > + { 1, 0x01, 8, 0}, /* enable */
> > + { 3, 0x7, 9, 0}, /* gain */
> > + { 8, 0xFF, 16, 0}, /* min_performance_level */
> > + { 1, 0x1, 12, 0}, /* temperature_override_enable */
> > + { 8, 0xFF, 24, 500}, /* temperature_override, units 0.5C
> > */
> > +};
> > +
> > +/* Unique offset for each PTC instance */
> > +static u32 ptc_offsets[] = {0x5B20, 0x5B28, 0x5B30};
>
> I'd prefer to define PTC_MAX_INSTANCES earlier and use
>   static u32 ptc_offsets[PTC_MAX_INSTANCES] = {0x5B20, 0x5B28,
> 0x5B30};
> instead.
>
> thanks,
> rui