Re: [PATCH v2] drm/panfrost: Ignore core_mask for poweroff and sync interrupts
From: Krzysztof Kozlowski
Date: Thu Nov 23 2023 - 06:57:23 EST
On 23/11/2023 12:50, AngeloGioacchino Del Regno wrote:
> Some SoCs may be equipped with a GPU containing two core groups
> and this is exactly the case of Samsung's Exynos 5422 featuring
> an ARM Mali-T628 MP6 GPU: the support for this GPU in Panfrost
> is partial, as this driver currently supports using only one
> core group and that's reflected on all parts of it, including
> the power on (and power off, previously to this patch) function.
>
> The issue with this is that even though executing the soft reset
> operation should power off all cores unconditionally, on at least
> one platform we're seeing a crash that seems to be happening due
> to an interrupt firing which may be because we are calling power
> transition only on the first core group, leaving the second one
> unchanged, or because ISR execution was pending before entering
> the panfrost_gpu_power_off() function and executed after powering
> off the GPU cores, or all of the above.
>
> Finally, solve this by introducing a new panfrost_gpu_suspend_irq()
> helper function and changing the panfrost_device_suspend() flow to
> 1. Mask and clear all interrupts: we don't need nor want any, as
> for power_off() we are polling PWRTRANS, but we anyway don't
> want GPU IRQs to fire while it is suspended/powered off;
> 2. Call synchronize_irq() after that to make sure that any pending
> ISR is executed before powering off the GPU Shaders/Tilers/L2
> hence avoiding unpowered registers R/W; and
> 3. Ignore the core_mask and ask the GPU to poweroff both core groups
>
> Of course it was also necessary to add a `irq` variable to `struct
> panfrost_device` as we need to get that in panfrost_gpu_power_off()
> for calling synchronize_irq() on it.
>
> Fixes: 22aa1a209018 ("drm/panfrost: Really power off GPU cores in panfrost_gpu_power_off()")
> [Regression detected on Odroid HC1, Exynos5422, Mali-T628 MP6]
> Reported-by: Krzysztof Kozlowski <krzysztof.kozlowski@xxxxxxxxxx>
> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@xxxxxxxxxxxxx>
> ---
>
> Changes in v2:
> - Fixed the commit hash of "Really power off [...]"
> - Actually based on a clean next-20231121
> - Renamed "irq" to "gpu_irq" as per Boris' suggestion
> - Moved the IRQ mask/clear/sync to a helper function and added
> a call to that in panfrost_device.c instead of doing that in
> panfrost_gpu_power_off().
>
> NOTE: I didn't split 1+2 from 3 as suggested by Boris, and I'm sending
> this one without waiting for feedback on my reasons for that which I
> explained as a reply to v1 because the former couldn't be applied to
> linux-next, and I want to unblock Krzysztof ASAP to get this tested.
>
This does not compile.
Best regards,
Krzysztof