RE: [v1] drm/msm: add null checks for drm device to avoid crash during probe defer
From: Vinod Polimera
Date: Fri Jun 03 2022 - 06:55:28 EST
> -----Original Message-----
> From: Dmitry Baryshkov <dmitry.baryshkov@xxxxxxxxxx>
> Sent: Friday, June 3, 2022 3:07 PM
> To: Vinod Polimera (QUIC) <quic_vpolimer@xxxxxxxxxxx>; dri-
> devel@xxxxxxxxxxxxxxxxxxxxx; linux-arm-msm@xxxxxxxxxxxxxxx;
> freedreno@xxxxxxxxxxxxxxxxxxxxx; devicetree@xxxxxxxxxxxxxxx
> Cc: linux-kernel@xxxxxxxxxxxxxxx; robdclark@xxxxxxxxx;
> dianders@xxxxxxxxxxxx; vpolimer@xxxxxxxxxxx; swboyd@xxxxxxxxxxxx;
> kalyant@xxxxxxxxxxx
> Subject: Re: [v1] drm/msm: add null checks for drm device to avoid crash
> during probe defer
>
> On 03/06/2022 12:22, Vinod Polimera wrote:
> > During probe defer, drm device is not initialized and an external
> > trigger to shutdown is trying to clean up drm device leading to crash.
> > Add checks to avoid drm device cleanup in such cases.
> >
> > BUG: unable to handle kernel NULL pointer dereference at virtual
> > address 00000000000000b8
> >
> > Call trace:
> >
> > drm_atomic_helper_shutdown+0x44/0x144
> > msm_pdev_shutdown+0x2c/0x38
> > platform_shutdown+0x2c/0x38
> > device_shutdown+0x158/0x210
> > kernel_restart_prepare+0x40/0x4c
> > kernel_restart+0x20/0x6c
> > __arm64_sys_reboot+0x194/0x23c
> > invoke_syscall+0x50/0x13c
> > el0_svc_common+0xa0/0x17c
> > do_el0_svc_compat+0x28/0x34
> > el0_svc_compat+0x20/0x70
> > el0t_32_sync_handler+0xa8/0xcc
> > el0t_32_sync+0x1a8/0x1ac
> >
> > Signed-off-by: Vinod Polimera <quic_vpolimer@xxxxxxxxxxx>
>
> Fixes ?
Added a Fixes tag in v2.
>
> > ---
> > drivers/gpu/drm/msm/msm_drv.c | 6 +++++-
> > 1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> > index 4448536..d62ac66 100644
> > --- a/drivers/gpu/drm/msm/msm_drv.c
> > +++ b/drivers/gpu/drm/msm/msm_drv.c
> > @@ -142,6 +142,9 @@ static void msm_irq_uninstall(struct drm_device *dev)
> > struct msm_drm_private *priv = dev->dev_private;
> > struct msm_kms *kms = priv->kms;
> >
> > + if (!irq_has_action(kms->irq))
> > + return;
> > +
>
> Is this part required with
> https://patchwork.freedesktop.org/patch/485422/?series=103702&rev=1?
Yes, I think this is a better approach than maintaining a new variable. A couple of other drivers follow a similar approach to guard uninstall against being called without a prior install.
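
To illustrate the idea, here is a minimal user-space sketch of that guard. It is not kernel code: struct mock_irq_line and the mock_* helpers are hypothetical stand-ins, and the has_action flag only models what irq_has_action() would report for the real interrupt line.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an interrupt line; not a kernel structure. */
struct mock_irq_line {
	bool has_action;	/* models what irq_has_action() would report */
	int free_calls;		/* counts how often the handler was released */
};

/* Models a successful install: a handler is now attached to the line. */
static void mock_irq_install(struct mock_irq_line *irq)
{
	irq->has_action = true;
}

/*
 * Models the guarded uninstall: return early when no handler was ever
 * installed, so the release path runs at most once per install.
 * Returns true only if a handler was actually released.
 */
static bool mock_irq_uninstall(struct mock_irq_line *irq)
{
	if (!irq->has_action)
		return false;

	irq->has_action = false;
	irq->free_calls++;
	return true;
}
```

With this shape, calling uninstall before install, or calling it twice, is harmless, and no separate bookkeeping variable is needed in the caller.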
>
> > kms->funcs->irq_uninstall(kms);
> > if (kms->irq_requested)
> > free_irq(kms->irq, dev);
> > @@ -259,6 +262,7 @@ static int msm_drm_uninit(struct device *dev)
> >
> > ddev->dev_private = NULL;
> > drm_dev_put(ddev);
> > + priv->dev = NULL;
>
> What are you trying to protect here?
If a shutdown call arrives after a probe deferral, priv->dev can still hold a stale, invalid pointer, so it needs to be cleared.
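
The sequence I have in mind can be modelled in user space roughly as follows. This is only a sketch under assumptions: the mock_* structures and functions are hypothetical and merely mirror the shape of the uninit and shutdown paths in the patch, with a counter standing in for drm_atomic_helper_shutdown().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; not the real kernel structures. */
struct mock_drm_device {
	int shutdown_calls;	/* stands in for drm_atomic_helper_shutdown() */
};

struct mock_kms {
	int irq;
};

struct mock_priv {
	struct mock_drm_device *dev;
	struct mock_kms *kms;
};

/* Models the uninit path: clear the pointer once the device is released. */
static void mock_uninit(struct mock_priv *priv)
{
	priv->dev = NULL;
}

/*
 * Models the guarded shutdown path: skip cleanup when the private data,
 * the kms, or the drm device is missing. Returns true only if cleanup ran.
 */
static bool mock_shutdown(struct mock_priv *priv)
{
	struct mock_drm_device *drm = priv ? priv->dev : NULL;

	if (!priv || !priv->kms || !drm)
		return false;

	drm->shutdown_calls++;
	return true;
}
```

In this model, a shutdown that follows mock_uninit() sees a NULL device and returns without touching it, which is the behaviour the extra !drm check is meant to provide.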
>
> >
> > destroy_workqueue(priv->wq);
> >
> > @@ -1167,7 +1171,7 @@ void msm_drv_shutdown(struct platform_device *pdev)
> > struct msm_drm_private *priv = platform_get_drvdata(pdev);
> > struct drm_device *drm = priv ? priv->dev : NULL;
> >
> > - if (!priv || !priv->kms)
> > + if (!priv || !priv->kms || !drm)
> > return;
> >
> > drm_atomic_helper_shutdown(drm);
>
>
> --
> With best wishes
> Dmitry