Re: [PATCH] media: mtk-jpeg: Fix use after free bug due to uncanceled work

From: Guenter Roeck
Date: Thu Mar 09 2023 - 00:31:34 EST


On 3/8/23 19:58, Zheng Hacker wrote:
Hi,

Thanks for your reply. I think you're right. I don't know if there is
another way to stop new work from being enqueued. Could you please give
me some advice about the fix?


Top-posting is discouraged.

Anyway -
I don't know the code well enough to suggest a solution.
It all depends on the driver architecture. The maintainers might
have a better idea.

A worse problem appears to be that the worker is also canceled
from mtk_jpeg_enc_irq() and mtk_jpeg_dec_irq(). Those are non-threaded
interrupt handlers which, as far as I know, must not sleep and thus
cannot call cancel_delayed_work_sync(). I have no idea how to solve
that problem either.

Guenter

Regards,
Zheng

On Thu, Mar 9, 2023 at 08:27, Guenter Roeck <linux@xxxxxxxxxxxx> wrote:

On Thu, Mar 02, 2023 at 05:37:15PM +0800, Zheng Wang wrote:
In mtk_jpeg_probe, &jpeg->job_timeout_work is bound to
mtk_jpeg_job_timeout_work. Then mtk_jpeg_dec_device_run
and mtk_jpeg_enc_device_run may be called to start the
work.
If we remove the module, which calls mtk_jpeg_remove
to clean up, there may still be unfinished work. The
possible sequence is as follows, which will cause a
typical UAF bug.

Fix it by canceling the work before cleanup in mtk_jpeg_remove.

CPU0                  CPU1

                    |mtk_jpeg_job_timeout_work
mtk_jpeg_remove     |
  v4l2_m2m_release  |
    kfree(m2m_dev); |
                    |
                    | v4l2_m2m_get_curr_priv
                    |   m2m_dev->curr_ctx //use

Signed-off-by: Zheng Wang <zyytlz.wz@xxxxxxx>
---
drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
index 969516a940ba..364513e7897e 100644
--- a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
+++ b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
@@ -1793,7 +1793,7 @@ static int mtk_jpeg_probe(struct platform_device *pdev)
 static int mtk_jpeg_remove(struct platform_device *pdev)
 {
 	struct mtk_jpeg_dev *jpeg = platform_get_drvdata(pdev);
-
+	cancel_delayed_work(&jpeg->job_timeout_work);

The empty line is needed (coding style). Also, this doesn't cancel
the worker if it is already running. This should probably be
cancel_delayed_work_sync(). Even then the question is if it is
possible that new work is queued before the device is unregistered.

Guenter

 	pm_runtime_disable(&pdev->dev);
 	video_unregister_device(jpeg->vdev);
 	v4l2_m2m_release(jpeg->m2m_dev);
--
2.25.1
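
For reference, the distinction raised in the review comment above
(cancel_delayed_work() does not stop a worker that is already running,
while cancel_delayed_work_sync() also waits for it) can be illustrated
with a toy user-space model. This is only a sketch under loud
assumptions: it is single-threaded, every model_* name is made up for
illustration and is not a kernel API, and "waiting" for a running
handler is modelled by simply letting it complete. It says nothing about
how the driver should actually be fixed.

```c
/* Toy single-threaded model of a delayed work item. NOT kernel code.
 * All model_* names are hypothetical and exist only for illustration. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_state { MODEL_IDLE, MODEL_PENDING, MODEL_RUNNING };

struct model_work {
	enum model_state state;
	int handler_runs;	/* number of times the handler has completed */
};

/* Models queueing the work: the timer is armed, the handler has not started. */
void model_queue(struct model_work *w)
{
	assert(w != NULL);
	if (w->state == MODEL_IDLE)
		w->state = MODEL_PENDING;
}

/* Models the timer firing and the handler starting on another CPU. */
void model_start(struct model_work *w)
{
	if (w->state == MODEL_PENDING)
		w->state = MODEL_RUNNING;
}

/* Models the handler returning. */
void model_finish(struct model_work *w)
{
	if (w->state == MODEL_RUNNING) {
		w->handler_runs++;
		w->state = MODEL_IDLE;
	}
}

/* Models cancel_delayed_work(): removes pending work and returns true
 * if it did so. A handler that is already running is left untouched. */
bool model_cancel(struct model_work *w)
{
	if (w->state == MODEL_PENDING) {
		w->state = MODEL_IDLE;
		return true;
	}
	return false;
}

/* Models cancel_delayed_work_sync(): same return value, but a running
 * handler has finished by the time this returns. The wait is modelled
 * by completing the handler here. */
bool model_cancel_sync(struct model_work *w)
{
	bool was_pending = model_cancel(w);

	model_finish(w);	/* no-op unless the handler is running */
	return was_pending;
}

/* Data the handler dereferences may only be freed while the work is idle. */
bool model_safe_to_free(const struct model_work *w)
{
	return w->state == MODEL_IDLE;
}
```

In this model, calling model_cancel() on work that has already started
returns false and leaves the work running, so freeing m2m_dev at that
point corresponds to the UAF in the diagram. model_cancel_sync() leaves
the work idle, but a later model_queue() makes it pending again, which
mirrors the open question of whether new work can be queued before the
device is unregistered.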