RE: [PATCH] drm/exynos: fix race condition UAF in exynos_g2d_exec_ioctl

From: Inki Dae/Tizen Platform Lab(SR)/Samsung Electronics
Date: Thu Jun 01 2023 - 21:21:07 EST


Andi~ :)

> -----Original Message-----
> From: Andi Shyti <andi.shyti@xxxxxxxxxx>
> Sent: Thursday, June 1, 2023 5:29 PM
> To: Inki Dae/Tizen Platform Lab(SR)/Samsung Electronics <inki.dae@xxxxxxxxxxx>
> Cc: 'lm0963' <lm0963hack@xxxxxxxxx>; sw0312.kim@xxxxxxxxxxx;
> kyungmin.park@xxxxxxxxxxx; airlied@xxxxxxxxx; daniel@xxxxxxxx;
> krzysztof.kozlowski@xxxxxxxxxx; alim.akhtar@xxxxxxxxxxx;
> dri-devel@xxxxxxxxxxxxxxxxxxxxx; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx;
> linux-samsung-soc@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH] drm/exynos: fix race condition UAF in
> exynos_g2d_exec_ioctl
>
> Hi Inki,
>
> > > > > > > > > If it is async, runqueue_node is freed in g2d_runqueue_worker on another
> > > > > > > > > worker thread. So in extreme cases, if g2d_runqueue_worker runs first, and
> > > > > > > > > then executes the following if statement, there will be use-after-free.
> > > > > > > >
> > > > > > > > Signed-off-by: Min Li <lm0963hack@xxxxxxxxx>
> > > > > > > > ---
> > > > > > > > drivers/gpu/drm/exynos/exynos_drm_g2d.c | 2 +-
> > > > > > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > > diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > > > > > > > index ec784e58da5c..414e585ec7dd 100644
> > > > > > > > --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > > > > > > > +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > > > > > > > > @@ -1335,7 +1335,7 @@ int exynos_g2d_exec_ioctl(struct drm_device *drm_dev, void *data,
> > > > > > > > /* Let the runqueue know that there is work to do. */
> > > > > > > > queue_work(g2d->g2d_workq, &g2d->runqueue_work);
> > > > > > > >
> > > > > > > > - if (runqueue_node->async)
> > > > > > > > + if (req->async)
> > > > > > >
> > > > > > > did you actually hit this? If you did, then the fix is not OK.
> > > > > >
> > > > > > > No, I didn't actually hit this. I found it through code review. This
> > > > > > > is only a theoretical issue that can only be triggered in extreme
> > > > > > > cases.
> > > > >
> > > > > first of all runqueue is used again two lines below this, which
> > > > > means that if you don't hit the uaf here you will hit it
> > > > > immediately after.
> > > >
> > > > > No, if async is true, then it will goto out, which will directly return.
> > > >
> > > > if (runqueue_node->async)
> > > > goto out; // here, go to out, will directly return
> > > >
> > > > wait_for_completion(&runqueue_node->complete); // not hit
> > > > g2d_free_runqueue_node(g2d, runqueue_node);
> > > >
> > > > out:
> > > > return 0;
> > >
> > > that's right, sorry, I misread it.
> > >
> > > > > > Second, if runqueue is freed, then we need to remove the part
> > > > > where it's freed because it doesn't make sense to free runqueue
> > > > > at this stage.
> > > >
> > > > It is freed by g2d_free_runqueue_node in g2d_runqueue_worker
> > > >
> > > > static void g2d_runqueue_worker(struct work_struct *work)
> > > > {
> > > > ......
> > > > if (runqueue_node) {
> > > > pm_runtime_mark_last_busy(g2d->dev);
> > > > pm_runtime_put_autosuspend(g2d->dev);
> > > >
> > > > complete(&runqueue_node->complete);
> > > > if (runqueue_node->async)
> > > > > g2d_free_runqueue_node(g2d, runqueue_node); // freed here
> > >
> > > this is what I'm wondering: is it correct to free a resource
> > > here? The design looks to me a bit fragile and prone to mistakes.
> >
> > This question seems to deviate from the purpose of this patch. If you
> > are providing additional opinions for code quality improvement unrelated
> > to this patch, it would be more appropriate for me to answer instead of
> > him.
>
> It's not deviating as the question was already made in my first
> review. It just looks strange to me that a piece of data shared
> amongst processes can be freed up without synchronizing. A bunch

I believe that if we overlook any doubts or concerns about worrisome
aspects without completely resolving them, it wouldn't be helpful to the
community.
Therefore, I would like to clarify more explicitly in order to ensure a
better understanding.

AFAIK, the data you mentioned isn't shared between processes. This data is
generated driver-internally when the user makes a rendering request and
will be removed once the 2D GPU finishes rendering.


However, there may be another issue that I'm not aware of, so if there is
any, please describe it more specifically, as that would help improve driver
stability.

Thanks again,
Inki Dae

> of if's do not make it robust enough.
>
> The patch itself, in my point of view, is not really fixing much
> and won't make any difference, it's just exposing the weakness I
> mentioned.
>
> However, honestly speaking, I don't know the driver well enough
> to suggest architectural changes and that's why I r-b'ed this
> one. But the first thing that comes to my mind, without looking
> much at the code, is using kref's as a way to make sure that a
> resource doesn't magically disappear under your nose.
>
> But, of course, this is up to you and if in your opinion this is
> OK and it fixes it... then you definitely know better :)
>
> Thanks for this discussion,
> Andi
>
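As a rough illustration of the reference-counting idea mentioned above, here
is a minimal userspace sketch. It is not the driver's code: struct rq_node,
rq_node_get(), rq_node_put(), rq_worker() and rq_submit() are hypothetical
names, and a plain int stands in for the kernel's atomic kref. The worker is
called synchronously to model the worst-case ordering in which it finishes
before the submitter looks at the node again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the runqueue node; not the driver's struct. */
struct rq_node {
	int refs;	/* plain counter; the kernel's kref is atomic */
	bool async;
	bool done;	/* stands in for the completion */
};

static int freed_count;	/* number of nodes released so far */

static struct rq_node *rq_node_new(bool async)
{
	struct rq_node *n = calloc(1, sizeof(*n));

	if (!n)
		return NULL;
	n->refs = 1;	/* reference owned by the submitter */
	n->async = async;
	return n;
}

static void rq_node_get(struct rq_node *n)
{
	n->refs++;
}

/* Drop one reference; the last holder frees the node. */
static void rq_node_put(struct rq_node *n)
{
	assert(n->refs > 0);
	if (--n->refs == 0) {
		free(n);
		freed_count++;
	}
}

/* Stand-in for the worker: finish the job, then drop its reference. */
static void rq_worker(struct rq_node *n)
{
	n->done = true;
	rq_node_put(n);
}

/* Returns true if the node was still valid and completed when inspected. */
static bool rq_submit(bool async)
{
	struct rq_node *n = rq_node_new(async);
	bool done;

	if (!n)
		return false;

	rq_node_get(n);	/* reference handed over to the worker */
	rq_worker(n);	/* worst case: worker finishes first */

	/* Still safe to read: the submitter holds its own reference. */
	done = n->done;
	rq_node_put(n);	/* last reference gone, node is freed here */
	return done;
}
```

With this shape neither side needs to know whether the request was async in
order to decide who frees the node; whoever drops the last reference does.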
> > The runqueue node - which contains command list for g2d rendering - is
> > generated when the user calls the ioctl system call. Therefore, if the
> > user-requested command list is rendered by g2d device then there is no
> > longer a reason to keep it. :)
> >
> > >
> > > The patch per se is OK. It doesn't make much difference to me
> > > where you actually read async, although this patch looks a bit
> > > safer:
> > >
> > > Reviewed-by: Andi Shyti <andi.shyti@xxxxxxxxxx>
> >
> > Thanks,
> > Inki Dae
> >
> > >
> > > However some refactoring might be needed to make it a bit more
> > > robust.
> > >
> > > Thanks,
> > > Andi
> > >
> > > > }
> > > >
> > > > >
> > > > > Finally, can you elaborate on the code review that you did so
> > > > > that we all understand it?
> > > >
> > > > queue_work(g2d->g2d_workq, &g2d->runqueue_work);
> > > > > msleep(100); // add sleep here to let g2d_runqueue_worker run first
> > > > if (runqueue_node->async)
> > > > goto out;
> > > >
> > > >
> > > > >
> > > > > Andi
> > > >
> > > >
> > > >
> > > > --
> > > > Min Li
> >
> >
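
For reference, the ordering discussed in this thread can be modelled in a few
lines of userspace C. This is only a sketch under stated assumptions, not the
driver's code: struct node, node_alloc(), node_free(), worker() and submit()
are hypothetical names, and the worker is called synchronously to reproduce
the "worker runs first" case that the msleep(100) experiment forces. The
point it shows is the one the patch makes: after handing the node to the
worker, the submitter decides from the request's own async flag rather than
from the node, which may already be gone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the runqueue node; not the driver's struct. */
struct node {
	bool async;
	bool done;	/* stands in for the completion */
};

static int live_nodes;	/* allocations that have not been freed yet */

static struct node *node_alloc(bool async)
{
	struct node *n = calloc(1, sizeof(*n));

	if (!n)
		return NULL;
	n->async = async;
	live_nodes++;
	return n;
}

static void node_free(struct node *n)
{
	free(n);
	live_nodes--;
}

/* Stand-in for the worker: completes the node and frees it if async. */
static void worker(struct node *n)
{
	bool async = n->async;	/* read while the worker still owns the node */

	n->done = true;
	if (async)
		node_free(n);
}

/* Stand-in for the exec ioctl; req_async models req->async. */
static int submit(bool req_async)
{
	struct node *n = node_alloc(req_async);

	if (!n)
		return -1;

	/* Worst case: the worker runs to completion before we continue. */
	worker(n);

	/* Use the request's flag; n may already have been freed. */
	if (req_async)
		return 0;

	/* Sync path: the worker did not free n, so the submitter does. */
	assert(n->done);
	node_free(n);
	return 0;
}
```

Reading n->async in place of req_async in submit() would touch freed memory
whenever the request is async, which is the case the patch avoids.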