Re: [PATCH] drm/msm/adreno: adreno_gpu: Use suspend() instead of idle() on load error
From: Konrad Dybcio
Date: Wed Mar 29 2023 - 11:49:04 EST
On 29.03.2023 16:37, Johan Hovold wrote:
> On Wed, Mar 29, 2023 at 04:04:44PM +0200, Konrad Dybcio wrote:
>> If we fail to initialize the GPU for whatever reason (say we don't
>> embed the GPU firmware files in the initrd), the error path involves
>> pm_runtime_put_sync() which then calls idle() instead of suspend().
>>
>> This is suboptimal, as it means that we're not going through the
>> clean shutdown sequence. With at least A619_holi, this makes the GPU
>> not wake up until it goes through at least one more start-fail-stop
>> cycle. Fix that by using pm_runtime_put_sync_suspend to force a clean
>> shutdown.
>
> This does not sound right. If pm_runtime_put_sync() fails to suspend the
> device when the usage count drops to zero, then you have a bug somewhere
> else.
I was surprised to see that it was not called as well, but I wasn't able
to track it down before..
>
> Also since commit 2c087a336676 ("drm/msm/adreno: Load the firmware
> before bringing up the hardware") the firmware is loaded before even
> hitting these paths so the above description does not sound right in
> that respect either (or is missing some details).
..but I did some more digging and I found that the precise "firmware"
that fails is the ZAP blob, which is not checked like SQE in the
commit you mentioned!
Now I don't think that we can easily check for it as-is since
zap_shader_load_mdt() does the entire find-load-authenticate
dance which is required with secure assets, but it's obviously
possible to rip out the find-load part of that and go on from
there.
Do you think that would be a better solution?
Konrad
>
>> Test cases:
>> 1. firmware baked into kernel
>> 2. error loading fw in initrd -> load from rootfs at DE start
>>
>> Both succeed on A619_holi (SM6375) and A630 (SDM845).
>>
>> Fixes: 0d997f95b70f ("drm/msm/adreno: fix runtime PM imbalance at gpu load")
>> Signed-off-by: Konrad Dybcio <konrad.dybcio@xxxxxxxxxx>
>> ---
>> drivers/gpu/drm/msm/adreno/adreno_device.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
>> index f61896629be6..59f3302e8167 100644
>> --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
>> +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
>> @@ -477,7 +477,7 @@ struct msm_gpu *adreno_load_gpu(struct drm_device *dev)
>> return gpu;
>>
>> err_put_rpm:
>> - pm_runtime_put_sync(&pdev->dev);
>> + pm_runtime_put_sync_suspend(&pdev->dev);
>> err_disable_rpm:
>> pm_runtime_disable(&pdev->dev);
>
> Johan