Re: [PATCH v2 2/2] arm64: dts: qcom: pm8998: Add pm8998 thermal zone

From: Matthias Kaehlcke
Date: Mon Jul 02 2018 - 18:35:08 EST


On Mon, Jul 02, 2018 at 01:46:11PM -0700, Matthias Kaehlcke wrote:
> On Mon, Jul 02, 2018 at 12:53:44PM -0700, Doug Anderson wrote:
> > Hi,
> >
> > On Mon, Jul 2, 2018 at 11:10 AM, Matthias Kaehlcke <mka@xxxxxxxxxxxx> wrote:
> > > The thermal zone uses spmi-temp-alarm as sensor. If the sensor is
> > > configured without an IIO input it always reports 37°C for temperatures
> > > below the first hardware trip point at 105°C. This hardware trip point
> > > is configured as critical trip point, to initiate a system shutdown
> > > before the temperature reaches the next hardware trip point at 125°C,
> > > where the PMIC performs a partial shutdown.
> > >
> > > The temperature of the critical trip point can be raised after adding
> > > the die temperature ADC as IIO input for spmi-temp-alarm, which
> > > significantly increases the precision of the temperature measurements.
> > >
> > > Signed-off-by: Matthias Kaehlcke <mka@xxxxxxxxxxxx>
> > > ---
> > > Changes in v2:
> > > - defined 'thermal-zones' node in pm8998.dtsi instead of using a label
> > > to refer to it
> > > - use 105°C hardware trip point as critical trip point
> >
> > I'm not sure this was right. I guess you're trying to avoid
> > Temperature Stage 2?
>
> Indeed
>
> > From David's email in response to v1:
> >
> > > The PMIC TEMP_ALARM hardware peripheral will perform an automatic partial
> > > PMIC shutdown upon hitting over-temperature stage 2 (125 C). This turns
> > > off peripherals within the PMIC that are expected to draw significant
> > > current. The set of peripherals included varies between PMICs. This
> > > partial shutdown will occur simultaneously with the triggering of an
> > > interrupt to the APPS processor that informs the qcom-spmi-temp-alarm
> > > driver that an over-temperature threshold has been crossed.
> >
> > I think it's actually OK to use Temperature Stage 2 as the "critical"
> > point, which is why it still interrupts the CPU. At "critical" the
> > system will shut down, right? ...so presumably it's OK if the drivers
> > can't recover from having the power yanked out from underneath them as
> > long as they don't hang/crash the system in this case. If I had to
> > guess the whole point of this stage is to give the system shutdown a
> > better chance of succeeding without getting to stage 3.
>
> That was my starting point, however in my tests the system reset
> several times when the temperature got close to 125°C, not allowing
> for a proper shutdown. Apparently the partial shutdown of the PMIC can
> result in a full reset at least on some systems.

For the record: Linux does a proper shutdown when software override
for stage 2 is enabled (bit OVRD_ST2_EN in TEMP_ALARM_SHUTDOWN_CTL1).
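
For illustration only, a rough sketch of what a thermal zone with the
105°C hardware trip point as critical trip might look like in
pm8998.dtsi. This is not the patch itself: the zone and trip node names,
the pm8998_temp sensor label, the polling delays and the hysteresis
value are all assumptions made for the example.

```dts
/* Sketch only: names, labels, delays and hysteresis are assumed. */
/ {
	thermal-zones {
		pm8998 {
			/* polling intervals in milliseconds (assumed values) */
			polling-delay-passive = <250>;
			polling-delay = <1000>;

			/* assumed label of the spmi-temp-alarm node */
			thermal-sensors = <&pm8998_temp>;

			trips {
				/*
				 * First hardware trip point (stage 1) used as
				 * critical, so the shutdown starts before the
				 * PMIC reaches stage 2 at 125°C.
				 * Temperatures are in millidegrees Celsius.
				 */
				pm8998_crit: pm8998-crit {
					temperature = <105000>;
					hysteresis = <2000>;
					type = "critical";
				};
			};
		};
	};
};
```

With the stage 2 software override enabled, the critical temperature
could presumably be moved up to the 125°C trip point instead.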