Re: [PATCH v4 0/6] RK3588 and Rock 5B dts additions: thermal, OPP and fan
From: Alexey Charkov
Date: Tue May 28 2024 - 15:26:40 EST
On Tue, May 28, 2024 at 8:08 PM Quentin Schulz <quentin.schulz@xxxxxxxxx> wrote:
>
> Hi all,
>
> On 5/28/24 5:42 PM, Alexey Charkov wrote:
> > On Tue, 28 May 2024 at 19:16, Heiko Stuebner <heiko@xxxxxxxxx> wrote:
> >
> >> Am Dienstag, 28. Mai 2024, 17:01:48 CEST schrieb Dragan Simic:
> >>> On 2024-05-28 16:34, Heiko Stuebner wrote:
> >>>> Am Dienstag, 28. Mai 2024, 16:05:04 CEST schrieb Dragan Simic:
> >>>>> On 2024-05-28 11:49, Alexey Charkov wrote:
> >>>>>> On Mon, May 6, 2024 at 1:37 PM Alexey Charkov <alchark@xxxxxxxxx>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>> This enables thermal monitoring and CPU DVFS on RK3588(s), as
> >>>>>>> well as active cooling on Radxa Rock 5B via the provided PWM fan.
> >>>>>>>
> >>>>>>> Some RK3588 boards use separate regulators to supply CPUs and their
> >>>>>>> respective memory interfaces, so this is handled by coupling those
> >>>>>>> regulators in affected boards' device trees to ensure that their
> >>>>>>> voltage is adjusted in step.
> >>>>>>>
> >>>>>>> This also enables the built-in thermal sensor (TSADC) for all
> >>>>>>> boards that don't currently have it enabled, using the default
> >>>>>>> CRU based emergency thermal reset. This default configuration
> >>>>>>> only uses on-SoC devices and doesn't rely on any external
> >>>>>>> wiring, thus it should work for all devices (tested only on
> >>>>>>> Rock 5B though).
> >>>>>>>
> >>>>>>> The boards that have TSADC_SHUT signal wired to the PMIC reset line
> >>>>>>> can choose to override the default reset logic in favour of GPIO
> >>>>>>> driven (PMIC assisted) reset, but in my testing it didn't work on
> >>>>>>> Radxa Rock 5B - maybe I'm reading the schematic wrong and it
> >>>>>>> doesn't support PMIC assisted reset after all.
> >>>>>>>
> >>>>>>> Fan control on Rock 5B has been split into two intervals: let it
> >>>>>>> spin at the minimum cooling state between 55C and 65C, and then
> >>>>>>> accelerate if the system crosses the 65C mark - thanks to Dragan
> >>>>>>> for suggesting. This lets some cooling setups with beefier
> >>>>>>> heatsinks and/or larger fan fins to stay in the quietest non-zero
> >>>>>>> fan state while still gaining potential benefits from the airflow
> >>>>>>> it generates, and possibly avoiding noisy speeds altogether for
> >>>>>>> some workloads.
> >>>>>>>
> >>>>>>> OPPs help actually scale CPU frequencies up and down for both
> >>>>>>> cooling and performance - tested on Rock 5B under varied loads.
> >>>>>>> I've dropped those OPPs that cause frequency reductions without
> >>>>>>> accompanying decrease in CPU voltage, as they don't seem to be
> >>>>>>> adding much benefit in day to day use, while the kernel log gets
> >>>>>>> a number of "OPP is inefficient" lines.
> >>>>>>>
> >>>>>>> Note that this submission doesn't touch the SRAM read margin
> >>>>>>> updates or the OPP calibration based on silicon quality which
> >>>>>>> the downstream driver does and which were mentioned in [1]. It
> >>>>>>> works as it is (also confirmed by Sebastian in his follow-up
> >>>>>>> message [2]), and it is stable in my testing on Rock 5B, so it
> >>>>>>> sounds better to merge a simple version first and then extend
> >>>>>>> when/if required.
> >>>>>>>
> >>>>>>> [1] https://lore.kernel.org/linux-rockchip/CABjd4YzTL=5S7cS8ACNAYVa730WA3iGd5L_wP1Vn9=f83RCORA@xxxxxxxxxxxxxx/
> >>>>>>> [2] https://lore.kernel.org/linux-rockchip/pkyne4g2cln27dcdu3jm7bqdqpmd2kwkbguiolmozntjuiajrb@gvq4nupzna4o/
> >>>>>>>
> >>>>>>> Signed-off-by: Alexey Charkov <alchark@xxxxxxxxx>
> >>>>>>> ---
> >>>>>>
> >>>>>> Hi Heiko,
> >>>>>>
> >>>>>> Do you think this can be merged for 6.11? Looks like there hasn't
> >>>>>> been any new feedback in a while, and it would be good to have
> >>>>>> frequency scaling in place for RK3588.
> >>>>>>
> >>>>>> Please let me know if you have any reservations or if we need any
> >>>>>> broader discussion.
> >>>>
> >>>> not really reservations, more like there was still discussion going on
> >>>> around the OPPs. Meanwhile we had more discussions regarding the whole
> >>>> speed binning Rockchip seems to do for rk3588 variants.
> >>>>
> >>>> And waiting for the testing Dragan wanted to do ;-) .
> >>>
> >>> I'm sorry for the delays.
> >>
> >> Was definitely _not_ meant as blame ;-) .
> >>
> >> The series has just too many discussions threads to unravel on half
> >> an afternoon.
> >
> >
> > FWIW, I think the latest exchange we had with Quentin regarding the OPPs
> > concluded in “false alarm”, given that this version of the series only
> > introduces a subset of them which should apply to all RK3588(s)
> >
>
> Correct.
>
> However... I'm wondering if we shouldn't somehow follow the same pattern
> we have used for the rk3399 OPPs? We have a file for the "true" RK3399
> OPPs, then the OP1 variant and the RK3399T.
>
> We already know there are a few variants of RK3588 with different OPPs:
> RK3588(S/S2?), RK3588J and RK3588M. I wouldn't be surprised if the
> RK3582 (though this one has already one big cluster (or two big cores)
> fewer than RK3588) has different OPPs as well?
>
> So. We have already discussed that the OPPs in that patch are valid for
> RK3588(S) but they aren't for the other variants.

Hmm. Looking at Rockchip sources [1] more closely as opposed to the
Radxa tree I've been using as the basis, it does indeed show that
RK3588J and RK3588M use a different OPP table altogether (frequencies
are similar, but voltages differ).

We currently have an RK3588J based board in the mainline tree
(rk3588-edgeble-neu6b-io.dts), so it can't be ignored. However, given
that Rockchip sources only differentiate those OPPs by SoC revision,
and that is (currently?) known for each board at dtb compile time, it
seems easier to just include two different OPP tables for RK3588(S)
vs. RK3588J - thus avoiding all the headache with opp-supported-hw.
Similar to RK3399, yes.

Will split those out and send a separate version.

> In the downstream kernel, an OPP is supported by RK3588M if the first
> value of its opp-supported-hw, masked with BIT(1), is non-zero.
> Likewise, an OPP is supported by RK3588J if that first value, masked
> with BIT(2), is non-zero.
>
> This means that, for LITTLE clusters:
> - opp-1608000000 not supported on RK3588J
> - opp-1704000000 only supported on RK3588M (but already absent in this
> patch series)
> - opp-1800000000 only supported on RK3588(S), not RK3588J nor RK3588M
>
> For big clusters:
> - opp-1800000000 not supported on RK3588J
> - opp-2016000000 not supported on RK3588J
> - opp-2208000000 only supported on RK3588(S), not RK3588J nor RK3588M
> - opp-2256000000 only supported on RK3588(S), not RK3588J nor RK3588M
> - opp-2304000000 only supported on RK3588(S), not RK3588J nor RK3588M
> - opp-2352000000 only supported on RK3588(S), not RK3588J nor RK3588M
> - opp-2400000000 only supported on RK3588(S), not RK3588J nor RK3588M
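
To make the mask rule quoted above concrete, here is a minimal sketch in
Python (a model only, not kernel code). BIT(1) for RK3588M and BIT(2) for
RK3588J come from the quoted description. BIT(0) for RK3588(S), the helper
names, and the mask values themselves are assumptions, picked only so that
the result matches the LITTLE cluster list above; they are not copied from
any dtsi.

```python
def BIT(n: int) -> int:
    """Return an integer with only bit n set, like the kernel's BIT() macro."""
    return 1 << n


# Variant -> bit tested in the first opp-supported-hw value.
VARIANT_BIT = {
    "RK3588(S)": BIT(0),  # assumption, not stated in the thread
    "RK3588M": BIT(1),    # from the quoted description
    "RK3588J": BIT(2),    # from the quoted description
}

# Hypothetical masks for the LITTLE cluster OPPs listed above.
LITTLE_OPPS = {
    "opp-1608000000": BIT(0) | BIT(1),  # everything except RK3588J (assumed)
    "opp-1704000000": BIT(1),           # RK3588M only
    "opp-1800000000": BIT(0),           # RK3588(S) only
}


def opp_supported(mask: int, variant: str) -> bool:
    """An OPP applies to a variant if the masked value is non-zero."""
    return (mask & VARIANT_BIT[variant]) != 0


def supported_opps(table: dict, variant: str) -> list:
    """List the OPP node names from table that apply to variant."""
    return [name for name, mask in table.items() if opp_supported(mask, variant)]


if __name__ == "__main__":
    for variant in VARIANT_BIT:
        print(variant, supported_opps(LITTLE_OPPS, variant))
```

With these assumed masks, RK3588J ends up with none of the three LITTLE
OPPs listed above, which matches the list.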
>
> This is somehow also enforced in downstream kernel by removing the OPP
> nodes directly (hence, not even requiring the check of opp-supported-hw
> value), c.f.:
> https://git.theobroma-systems.com/tiger-linux.git/tree/arch/arm64/boot/dts/rockchip/rk3588j.dtsi
> https://git.theobroma-systems.com/tiger-linux.git/tree/arch/arm64/boot/dts/rockchip/rk3588m.dtsi
>
> You'll note that the RK3588J also has fewer OPPs for the GPU and NPU (but
> those should also be masked by the opp-supported-hw value).

Same with DMC, but we don't currently have either DMC or NPU in the
mainline tree, so it sounds like something to be dealt with later :)

Best regards,
Alexey

[1] https://github.com/rockchip-linux/kernel/blob/develop-5.10/arch/arm64/boot/dts/rockchip/rk3588s.dtsi