Re: [PATCH v2 0/6] Fix RK3588 GPU domain

From: Ulf Hansson
Date: Wed Oct 02 2024 - 07:01:39 EST


On Thu, 19 Sept 2024 at 11:18, Sebastian Reichel
<sebastian.reichel@xxxxxxxxxxxxx> wrote:
>
> Hi,
>
> I got a report that the Linux kernel crashes on Rock 5B when the panthor
> driver is loaded late after booting. The crash starts with the following
> shortened error print:
>
> rockchip-pm-domain fd8d8000.power-management:power-controller: failed to set domain 'gpu', val=0
> rockchip-pm-domain fd8d8000.power-management:power-controller: failed to get ack on domain 'gpu', val=0xa9fff
> SError Interrupt on CPU4, code 0x00000000be000411 -- SError
>
> This series first does some cleanups in the Rockchip power domain
> driver and changes the driver so that it no longer tries to continue
> when it fails to enable a domain. This gets rid of the SError interrupt
> and long backtraces. But the kernel still hangs when it fails to enable
> a power domain. I have not done further analysis to check if that can
> be avoided.
>
> Last but not least this provides a fix for the GPU power domain failing
> to get enabled - after some testing from my side it seems to require the
> GPU voltage supply to be enabled.
>
> I'm not really happy about the hack to get a regulator for a sub-node,
> which I took over from the Mediatek driver. I discussed this with
> Chen-Yu Tsai and Heiko Stübner at OSS EU and the plan is:
>
> 1. Merge Rockchip PM domain driver with this hack for now, since DRM CI
> people need it
> 2. Chen-Yu will work on a series that fixes the hack in Mediatek by
> introducing a new devm_regulator_get function taking a DT node as
> an additional argument
> 3. Rockchip PM domain later will switch to that once it has landed

I have just queued up 2) on my next branch.

My suggestion is to skip the intermediate step in 1) and go directly
for 3) instead, unless you think there is a problem with that, of
course?

[...]

Kind regards
Uffe