Re: [PATCH v4 0/6] RK3588 and Rock 5B dts additions: thermal, OPP and fan

From: Quentin Schulz
Date: Tue May 28 2024 - 12:08:25 EST


Hi all,

On 5/28/24 5:42 PM, Alexey Charkov wrote:
On Tue, 28 May 2024 at 19:16, Heiko Stuebner <heiko@xxxxxxxxx> wrote:

Am Dienstag, 28. Mai 2024, 17:01:48 CEST schrieb Dragan Simic:
On 2024-05-28 16:34, Heiko Stuebner wrote:
Am Dienstag, 28. Mai 2024, 16:05:04 CEST schrieb Dragan Simic:
On 2024-05-28 11:49, Alexey Charkov wrote:
On Mon, May 6, 2024 at 1:37 PM Alexey Charkov <alchark@xxxxxxxxx>
wrote:

This enables thermal monitoring and CPU DVFS on RK3588(s), as well as
active cooling on Radxa Rock 5B via the provided PWM fan.

Some RK3588 boards use separate regulators to supply CPUs and their
respective memory interfaces, so this is handled by coupling those
regulators in affected boards' device trees to ensure that their
voltage is adjusted in step.

This also enables the built-in thermal sensor (TSADC) for all boards
that don't currently have it enabled, using the default CRU based
emergency thermal reset. This default configuration only uses on-SoC
devices and doesn't rely on any external wiring, thus it should work
for all devices (tested only on Rock 5B though).

The boards that have TSADC_SHUT signal wired to the PMIC reset line
can choose to override the default reset logic in favour of GPIO
driven (PMIC assisted) reset, but in my testing it didn't work on
Radxa Rock 5B - maybe I'm reading the schematic wrong and it doesn't
support PMIC assisted reset after all.

Fan control on Rock 5B has been split into two intervals: let it spin
at the minimum cooling state between 55C and 65C, and then accelerate
if the system crosses the 65C mark - thanks to Dragan for the
suggestion. This lets some cooling setups with beefier heatsinks
and/or larger fan fins stay in the quietest non-zero fan state while
still gaining potential benefits from the airflow it generates, and
possibly avoiding noisy speeds altogether for some workloads.

OPPs help actually scale CPU frequencies up and down for both cooling
and performance - tested on Rock 5B under varied loads. I've dropped
those OPPs that cause frequency reductions without accompanying
decrease in CPU voltage, as they don't seem to be adding much benefit
in day to day use, while the kernel log gets a number of "OPP is
inefficient" lines.

Note that this submission doesn't touch the SRAM read margin updates
or the OPP calibration based on silicon quality which the downstream
driver does and which were mentioned in [1]. It works as it is (also
confirmed by Sebastian in his follow-up message [2]), and it is stable
in my testing on Rock 5B, so it sounds better to merge a simple version
first and then extend when/if required.

[1] https://lore.kernel.org/linux-rockchip/CABjd4YzTL=5S7cS8ACNAYVa730WA3iGd5L_wP1Vn9=f83RCORA@xxxxxxxxxxxxxx/
[2] https://lore.kernel.org/linux-rockchip/pkyne4g2cln27dcdu3jm7bqdqpmd2kwkbguiolmozntjuiajrb@gvq4nupzna4o/

Signed-off-by: Alexey Charkov <alchark@xxxxxxxxx>
---

Hi Heiko,

Do you think this can be merged for 6.11? Looks like there hasn't been
any new feedback in a while, and it would be good to have frequency
scaling in place for RK3588.

Please let me know if you have any reservations or if we need any
broader discussion.

not really reservations, more like there was still discussion going on
around the OPPs. Meanwhile we had more discussions regarding the whole
speed binning Rockchip seems to do for rk3588 variants.

And waiting for the testing Dragan wanted to do ;-) .

I'm sorry for the delays.

Was definitely _not_ meant as blame ;-) .

The series just has too many discussion threads to unravel in half
an afternoon.


FWIW, I think the latest exchange we had with Quentin regarding the OPPs
concluded in “false alarm”, given that this version of the series only
introduces a subset of them which should apply to all RK3588(s).


Correct.

However... I'm wondering if we shouldn't somehow follow the same pattern we have used for the rk3399 OPPs? We have a file for the "true" RK3399 OPPs, then the OP1 variant and the RK3399T.

We already know there are a few variants of RK3588 with different OPPs: RK3588(S/S2?), RK3588J and RK3588M. I wouldn't be surprised if the RK3582 (though this one has already one big cluster (or two big cores) fewer than RK3588) has different OPPs as well?

So. We have already discussed that the OPPs in that patch are valid for RK3588(S) but they aren't for the other variants.

In the downstream kernel, any OPP whose first opp-supported-hw value, masked with BIT(1), is non-zero is supported by RK3588M. Likewise, any OPP whose first opp-supported-hw value, masked with BIT(2), is non-zero is supported by RK3588J.
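
To make the masking rule concrete, here is a minimal C sketch of the check as I described it above. It is not the downstream code: the helper name and the macro names are made up for illustration, and only the two bit positions (1 for RK3588M, 2 for RK3588J) come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as described in this mail; the macro names are hypothetical. */
#define RK3588M_HW_BIT 1u
#define RK3588J_HW_BIT 2u

/*
 * Hypothetical helper: returns true when the first opp-supported-hw
 * value has the bit for the given variant set.
 */
static bool opp_supported_on(uint32_t supported_hw0, unsigned int variant_bit)
{
	assert(variant_bit < 32); /* a shift by 32 or more would be undefined */
	return (supported_hw0 & (UINT32_C(1) << variant_bit)) != 0;
}
```

With this, an OPP is kept for RK3588J only when opp_supported_on(value, RK3588J_HW_BIT) is true, and the same goes for RK3588M with RK3588M_HW_BIT.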

This means that, for LITTLE clusters:
- opp-1608000000 not supported on RK3588J
- opp-1704000000 only supported on RK3588M (but already absent in this patch series)
- opp-1800000000 only supported on RK3588(S), not RK3588J nor RK3588M

For big clusters:
- opp-1800000000 not supported on RK3588J
- opp-2016000000 not supported on RK3588J
- opp-2208000000 only supported on RK3588(S), not RK3588J nor RK3588M
- opp-2256000000 only supported on RK3588(S), not RK3588J nor RK3588M
- opp-2304000000 only supported on RK3588(S), not RK3588J nor RK3588M
- opp-2352000000 only supported on RK3588(S), not RK3588J nor RK3588M
- opp-2400000000 only supported on RK3588(S), not RK3588J nor RK3588M
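
The two lists above can also be written down as a small lookup table. This is only a sketch in C: the V_* flags are local to the example (they are not the downstream opp-supported-hw encoding), the function name is made up, and treating every OPP that is not in the lists as valid on all variants is my assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flags local to this sketch, NOT the downstream opp-supported-hw bits. */
enum { V_RK3588 = 1, V_RK3588M = 2, V_RK3588J = 4 };

struct opp_rule {
	bool big;             /* true: big cluster, false: LITTLE cluster */
	uint64_t hz;          /* OPP frequency in Hz */
	unsigned int allowed; /* variants the lists above leave this OPP for */
};

static const struct opp_rule rules[] = {
	{ false, 1608000000ULL, V_RK3588 | V_RK3588M },
	{ false, 1704000000ULL, V_RK3588M },
	{ false, 1800000000ULL, V_RK3588 },
	{ true,  1800000000ULL, V_RK3588 | V_RK3588M },
	{ true,  2016000000ULL, V_RK3588 | V_RK3588M },
	{ true,  2208000000ULL, V_RK3588 },
	{ true,  2256000000ULL, V_RK3588 },
	{ true,  2304000000ULL, V_RK3588 },
	{ true,  2352000000ULL, V_RK3588 },
	{ true,  2400000000ULL, V_RK3588 },
};

/* Assumption: an OPP that is not listed is valid on every variant. */
static bool opp_allowed(bool big, uint64_t hz, unsigned int variant)
{
	assert(variant == V_RK3588 || variant == V_RK3588M ||
	       variant == V_RK3588J);
	for (size_t i = 0; i < sizeof(rules) / sizeof(rules[0]); i++)
		if (rules[i].big == big && rules[i].hz == hz)
			return (rules[i].allowed & variant) != 0;
	return true;
}
```

For example, opp_allowed(true, 2016000000ULL, V_RK3588J) is false, while the same OPP is allowed for V_RK3588M.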

This is also enforced in the downstream kernel by removing the OPP nodes directly (so the opp-supported-hw check is not even needed), cf.:
https://git.theobroma-systems.com/tiger-linux.git/tree/arch/arm64/boot/dts/rockchip/rk3588j.dtsi
https://git.theobroma-systems.com/tiger-linux.git/tree/arch/arm64/boot/dts/rockchip/rk3588m.dtsi

You'll note that the RK3588J also has fewer OPPs for the GPU and NPU (but those should also be masked by the opp-supported-hw value).

Cheers,
Quentin