[PATCH] media: chips-media: wave5: Call v4l2_m2m_job_finish() in job_abort()

From: Mattijs Korpershoek
Date: Wed Feb 28 2024 - 11:14:55 EST


When aborting a stream, it's possible that v4l2_m2m_cancel_job()
remains stuck:

[ 3498.490038][ T1] sysrq: Show Blocked State
[ 3498.511754][ T1] task:V4L2DecodeCompo state:D stack:12480 pid:2387 ppid:1 flags:0x04000809
[ 3498.521153][ T1] Call trace:
[ 3498.524333][ T1] __switch_to+0x174/0x338
[ 3498.528611][ T1] __schedule+0x5ec/0x9cc
[ 3498.532795][ T1] schedule+0x84/0xf0
[ 3498.536630][ T1] v4l2_m2m_cancel_job+0x118/0x194
[ 3498.541595][ T1] v4l2_m2m_streamoff+0x34/0x13c
[ 3498.546384][ T1] v4l2_m2m_ioctl_streamoff+0x20/0x30
[ 3498.551607][ T1] v4l_streamoff+0x44/0x54
[ 3498.555877][ T1] __video_do_ioctl+0x388/0x4dc
[ 3498.560580][ T1] video_usercopy+0x618/0xa0c
[ 3498.565109][ T1] video_ioctl2+0x20/0x30
[ 3498.569292][ T1] v4l2_ioctl+0x74/0x8c
[ 3498.573300][ T1] __arm64_sys_ioctl+0xb0/0xec
[ 3498.577918][ T1] invoke_syscall+0x60/0x124
[ 3498.582368][ T1] el0_svc_common+0x90/0xfc
[ 3498.586724][ T1] do_el0_svc+0x34/0xb8
[ 3498.590733][ T1] el0_svc+0x2c/0xa4
[ 3498.594480][ T1] el0t_64_sync_handler+0x68/0xb4
[ 3498.599354][ T1] el0t_64_sync+0x1a4/0x1a8
[ 3498.603832][ T1] sysrq: Kill All Tasks

According to job_abort() documentation from v4l2_m2m_ops:

After the driver performs the necessary steps, it has to call
v4l2_m2m_job_finish() or v4l2_m2m_buf_done_and_job_finish() as
if the transaction ended normally.

This is not done in wave5_vpu_dec_job_abort(). Neither switch_state() nor
wave5_vpu_dec_set_eos_on_firmware() does this.

Add the missing call to fix the v4l2_m2m_cancel_job() hangs.

Fixes: 9707a6254a8a ("media: chips-media: wave5: Add the v4l2 layer")
Signed-off-by: Mattijs Korpershoek <mkorpershoek@xxxxxxxxxxxx>
---
This has been tested on the AM62Px SK EVM using Android 14.
See:
https://www.ti.com/tool/PROCESSOR-SDK-AM62P

Note: while this is has not been tested on an upstream linux tree, I
believe the fix is still valid for mainline and I would like to get
some feedback on it.

Thank you in advance for your time.
---
drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
index ef227af72348..b03e3633a1bc 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
+++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
@@ -1715,6 +1715,7 @@ static void wave5_vpu_dec_device_run(void *priv)
static void wave5_vpu_dec_job_abort(void *priv)
{
struct vpu_instance *inst = priv;
+ struct v4l2_m2m_ctx *m2m_ctx = inst->v4l2_fh.m2m_ctx;
int ret;

ret = switch_state(inst, VPU_INST_STATE_STOP);
@@ -1725,6 +1726,8 @@ static void wave5_vpu_dec_job_abort(void *priv)
if (ret)
dev_warn(inst->dev->dev,
"Setting EOS for the bitstream, fail: %d\n", ret);
+
+ v4l2_m2m_job_finish(inst->v4l2_m2m_dev, m2m_ctx);
}

static int wave5_vpu_dec_job_ready(void *priv)

---
base-commit: 8c64f4cdf4e6cc5682c52523713af8c39c94e6d5
change-id: 20240228-wave5-fix-abort-f72d25881cbd

Best regards,
--
Mattijs Korpershoek <mkorpershoek@xxxxxxxxxxxx>