Re: [EXT] Re: [PATCH v6] arm64: dts: ls1088a: add one more thermal zone node

From: Viresh Kumar
Date: Fri May 10 2019 - 06:13:45 EST


On 10-05-19, 08:47, Andy Tang wrote:
> + Viresh for help.
>
> > -----Original Message-----
> > From: Daniel Lezcano <daniel.lezcano@xxxxxxxxxx>
> > Sent: 2019å5æ10æ 15:17
> > To: Andy Tang <andy.tang@xxxxxxx>; Shawn Guo <shawnguo@xxxxxxxxxx>
> > Cc: Leo Li <leoyang.li@xxxxxxx>; robh+dt@xxxxxxxxxx;
> > mark.rutland@xxxxxxx; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx;
> > devicetree@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > linux-pm@xxxxxxxxxxxxxxx; rui.zhang@xxxxxxxxx; edubezval@xxxxxxxxx
> > Subject: Re: [EXT] Re: [PATCH v6] arm64: dts: ls1088a: add one more thermal
> > zone node
> >
> > Caution: EXT Email
> >
> > On 10/05/2019 05:40, Andy Tang wrote:
> > >> -----Original Message-----
> > >> From: Shawn Guo <shawnguo@xxxxxxxxxx>
> > >> Sent: 2019å5æ10æ 11:14
> > >> To: Andy Tang <andy.tang@xxxxxxx>
> > >> Cc: Leo Li <leoyang.li@xxxxxxx>; robh+dt@xxxxxxxxxx;
> > >> mark.rutland@xxxxxxx; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx;
> > >> devicetree@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > >> linux-pm@xxxxxxxxxxxxxxx; daniel.lezcano@xxxxxxxxxx;
> > >> rui.zhang@xxxxxxxxx; edubezval@xxxxxxxxx
> > >> Subject: [EXT] Re: [PATCH v6] arm64: dts: ls1088a: add one more
> > >> thermal zone node
> > >>
> > >> Caution: EXT Email
> > >>
> > >> On Tue, Apr 23, 2019 at 10:25:07AM +0800, Yuantian Tang wrote:
> > >>> Ls1088a has 2 thermal sensors, core cluster and SoC platform. Core
> > >>> cluster sensor is used to monitor the temperature of core and SoC
> > >>> platform is for platform. The current dts only support the first sensor.
> > >>> This patch adds the second sensor node to dts to enable it.
> > >>>
> > >>> Signed-off-by: Yuantian Tang <andy.tang@xxxxxxx>
> > >>> ---
> > >>> v6:
> > >>> - add cooling device map to cpu0-7 in platform node.
> > > I like to explain a little. I think it makes sense that multiple thermal zone
> > map to same cooling device.
> > > In this way, no matter which thermal zone raises a temp alarm, it can call
> > cooling device to chill out.
> > > I also asked cpufreq maintainer about the cooling map issue, he think it
> > would be fine.

Yes, you asked me and I said it should be okay.

> > > I have tested and no issue found.
> > >
> > > Daniel, what's your thought?
> >
> > If there are multiple thermal zones, they will be managed by different
> > instances of a thermal governor. Each instances will act on the shared cooling
> > device and will collide in their decisions:
> >
> > - If the sensors are closed, their behavior will be similar regarding the
> > temperature. The governors may take the same decision for the cooling
> > device. But in such case having just one thermal zone managed is enough.
> >
> > - If the sensors are not closed, their behavior will be different regarding the
> > temperature. The governors will take different decision regarding the cooling
> > device (one will decrease the freq, other will increase the freq).
> >
> > As the thermal governors are not able to manage several thermal zones and
> > there is one cooling device (the cpu cooling device), this setup won't work as
> > expected IMO.
> >
> > The setup making sense is having a thermal zone per 'cluster' and a cooling
> > device per 'cluster'. That means the platform has one clock line per 'cluster'.
> > The thermal management happens in a self-contained thermal zone (one
> > cooling device - one governor - one thermal zone).
> >
> > In the case of HMP, other combinations are possible to be optimal.

But not sure how I missed the obvious, though I do remember thinking about this.

So the problem is that the cpu_cooling driver will get requests in parallel to
set different max frequencies and the last call will always win and may result
in undesired outcome.

Sorry about creating the confusion.

--
viresh