Re: [PATCH] drm/amdgpu: Fix resource leak on probe error path

From: Jiri Kosina
Date: Thu Jul 01 2021 - 04:32:49 EST


On Thu, 24 Jun 2021, Jiri Kosina wrote:

> From: Jiri Kosina <jkosina@xxxxxxx>
>
> This reverts commit 4192f7b5768912ceda82be2f83c87ea7181f9980.
>
> It is not true (as stated in the reverted commit changelog) that we never
> unmap the BAR on failure; it actually does happen properly on
> amdgpu_driver_load_kms() -> amdgpu_driver_unload_kms() ->
> amdgpu_device_fini() error path.
>
> What's worse, this commit actually completely breaks resource freeing on
> probe failure (like e.g. failure to load microcode), as
> amdgpu_driver_unload_kms() notices adev->rmmio being NULL and bails too
> early, leaving all the resources that'd normally be freed in
> amdgpu_acpi_fini() and amdgpu_device_fini() still hanging around, leading
> to all sorts of oopses when someone tries to, for example, access the
> sysfs and procfs resources which are still around while the driver is
> gone.
>
> Fixes: 4192f7b57689 ("drm/amdgpu: unmap register bar on device init failure")
> Reported-by: Vojtech Pavlik <vojtech@xxxxxx>
> Signed-off-by: Jiri Kosina <jkosina@xxxxxxx>

Friendly ping on this one (as it's easily triggerable in the wild by just
missing proper firwmare).

Thanks.

> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++------
> 1 file changed, 2 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 57ec108b5972..0f1c0e17a587 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3414,13 +3414,13 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> r = amdgpu_device_get_job_timeout_settings(adev);
> if (r) {
> dev_err(adev->dev, "invalid lockup_timeout parameter syntax\n");
> - goto failed_unmap;
> + return r;
> }
>
> /* early init functions */
> r = amdgpu_device_ip_early_init(adev);
> if (r)
> - goto failed_unmap;
> + return r;
>
> /* doorbell bar mapping and doorbell index init*/
> amdgpu_device_doorbell_init(adev);
> @@ -3646,10 +3646,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> failed:
> amdgpu_vf_error_trans_all(adev);
>
> -failed_unmap:
> - iounmap(adev->rmmio);
> - adev->rmmio = NULL;
> -
> return r;
> }
>
> --
> 2.12.3
>
>

--
Jiri Kosina
SUSE Labs