Re: [PATCH 2/4] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588
From: Alexey Charkov
Date: Thu Jan 25 2024 - 09:47:34 EST
On Thu, Jan 25, 2024 at 2:02 PM Daniel Lezcano
<daniel.lezcano@xxxxxxxxxx> wrote:
>
> On 25/01/2024 09:26, Alexey Charkov wrote:
> > On Thu, Jan 25, 2024 at 1:56 AM Daniel Lezcano
> > <daniel.lezcano@xxxxxxxxxx> wrote:
> >>
> >> On 24/01/2024 21:30, Alexey Charkov wrote:
> >>> Include thermal zones information in device tree for rk3588 variants
> >>
> >> There is an energy model for the CPUs. But finding out the sustainable
> >> power may be a bit tricky. So I suggest to remove everything related to
> >> the power allocator in this change and propose a dedicated change with
> >> all the power configuration (which includes proper k_p* coefficients to
> >> be set from userspace to have a flat mitigation figure).
> >>
> >> That implies removing the "contribution" properties in this description.
> >
> > Alright, I'll just drop those "contribution" properties, thanks!
> >
> >> Some comments below, but this version is definitely close to being OK.
> >
> > Yay! :)
> >
> >>> Signed-off-by: Alexey Charkov <alchark@xxxxxxxxx>
> >>> ---
> >>> arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 165 ++++++++++++++++++++++++++++++
> >>> 1 file changed, 165 insertions(+)
> >>>
> >>> diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> >>> index 36b1b7acfe6a..131b9eb21398 100644
> >>> --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> >>> +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> >>> @@ -10,6 +10,7 @@
> >>> #include <dt-bindings/reset/rockchip,rk3588-cru.h>
> >>> #include <dt-bindings/phy/phy.h>
> >>> #include <dt-bindings/ata/ahci.h>
> >>> +#include <dt-bindings/thermal/thermal.h>
> >>>
> >>> / {
> >>> compatible = "rockchip,rk3588";
> >>> @@ -2228,6 +2229,170 @@ tsadc: tsadc@fec00000 {
> >>> status = "disabled";
> >>> };
> >>>
> >>> + thermal_zones: thermal-zones {
> >>> + /* sensor near the center of the whole chip */
> >>> + package_thermal: package-thermal {
> >>> + polling-delay-passive = <0>;
> >>> + polling-delay = <0>;
> >>> + thermal-sensors = <&tsadc 0>;
> >>> +
> >>> + trips {
> >>> + package_crit: package-crit {
> >>> + temperature = <115000>;
> >>> + hysteresis = <0>;
> >>> + type = "critical";
> >>> + };
> >>> + };
> >>> + };
> >>> +
> >>> + /* sensor between A76 cores 0 and 1 */
> >>> + bigcore0_thermal: bigcore0-thermal {
> >>> + polling-delay-passive = <20>;
> >>
> >> 20ms seems very short, is this value on purpose? Or just picked up
> >> arbitrarily?
> >
> > Frankly, I simply used the value that Radxa's downstream DTS sets for
> > my board. 100ms seems to work just as well.
> >
> >> If it is possible, perhaps you should profile the temperature of these
> >> thermal zones (CPUs ones). There is a tool in
> >> <linuxdir>/tools/thermal/thermometer to do that.
> >>
> >> You can measure with 10ms sampling rate when running for instance
> >> dhrystone pinned on b0 and b1, then on b2 and b3. And finally on the
> >> small cluster.
> >
> > It seems tricky to isolate the effects from just one of the CPU
> > clusters, as their individual thermal outputs are not that high.
> >
> > For my testing I disabled the fan (but didn't remove the heatsink to
> > avoid wasting the thermal interface tape),
>
> It is ok but the system will have more heat capacity and it will be
> necessary to saturate it before running the tests. IOW warm up the
> system by running thermal stress tests several times.
>
> > and tried loading CPUs with
> > stress-ng. Here are the observations:
>
> Usually I use dhrystone to thermally stress the cores (e.g. for one minute).

Hmm, could you please point to the source package or repo to get the
version of dhrystone you use? I could only find the old shar [1] and a
Debian version with added Makefile [2], but neither seems to produce
multiple threads.

It doesn't seem to be packaged for either Gentoo or Fedora unfortunately.
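
As a possible stopgap for the missing multi-threaded build, one pinned
busy-loop worker per core could stand in for dhrystone. The sketch below
is only an illustration: the integer loop is not dhrystone, the names
burn() and load_cpus() are made up for this example, and the CPU numbers
in the usage comment (4 and 5 for b0 and b1) are an assumption about the
rk3588 core numbering.

```python
import multiprocessing as mp
import os
import time


def burn(cpu: int, seconds: float) -> int:
    """Pin the calling process to `cpu` and spin on integer math for `seconds`.

    Returns the number of inner-loop batches completed. Note that the
    affinity change sticks to the calling process (Linux-only call).
    """
    os.sched_setaffinity(0, {cpu})
    deadline = time.monotonic() + seconds
    batches = 0
    x = 1
    while time.monotonic() < deadline:
        # Plain integer arithmetic: a stand-in load, not the dhrystone mix.
        for _ in range(10000):
            x = (x * 1103515245 + 12345) & 0xFFFFFFFF
        batches += 1
    return batches


def load_cpus(cpus, seconds):
    """Run one pinned worker process per CPU in parallel.

    Returns a dict mapping each CPU number to its completed batch count.
    """
    ctx = mp.get_context("fork")  # fork keeps burn() visible to the workers
    with ctx.Pool(len(cpus)) as pool:
        counts = pool.starmap(burn, [(c, seconds) for c in cpus])
    return dict(zip(cpus, counts))


# Hypothetical usage, assuming cpu4/cpu5 are the first A76 pair:
#   load_cpus([4, 5], 60.0)
```

Running several single-threaded dhrystone instances under taskset would
achieve the same pinning without any of this.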

Indeed, I'm getting a higher thermal load than with stress-ng --cpu
simply by compiling the kernel sources, although I'd expect the build to
wait on memory and/or I/O quite a lot.
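
To compare such loads numerically, the zone temperatures can be sampled
from sysfs at a fixed interval. The following is a rough sketch, not a
replacement for tools/thermal/thermometer: it assumes the standard
/sys/class/thermal/thermal_zone*/temp attributes (millidegrees Celsius),
and the function names are made up for this example.

```python
import time
from pathlib import Path


def read_millicelsius(zone_dir: Path) -> int:
    """Read the `temp` attribute of one thermal zone, in millidegrees Celsius."""
    return int((zone_dir / "temp").read_text().strip())


def sample_zones(base: Path, interval_s: float, count: int):
    """Take `count` samples of every thermal_zone* under `base`.

    Returns a list of (monotonic timestamp, {zone name: millidegrees}) tuples,
    sleeping `interval_s` seconds after each sample.
    """
    zones = sorted(base.glob("thermal_zone*"))
    rows = []
    for _ in range(count):
        temps = {z.name: read_millicelsius(z) for z in zones}
        rows.append((time.monotonic(), temps))
        time.sleep(interval_s)
    return rows


# Hypothetical usage: 100 samples, 10ms apart, on the running system:
#   for ts, temps in sample_zones(Path("/sys/class/thermal"), 0.01, 100):
#       print(f"{ts:.3f}", temps)
```

The real sampling period will be somewhat longer than interval_s, since
each sysfs read triggers a sensor readout of its own.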

[1] https://www.netlib.org/benchmark/dhry-c
[2] https://github.com/qris/dhrystone-deb

Thanks a lot,
Alexey