Re: [PATCH 2/2] media: verisilicon: Avoid G2 bus error while decoding H.264 and HEVC
From: Ming Qian(OSS)
Date: Mon Nov 24 2025 - 21:40:29 EST
Hi Nicolas and Frank,
On 11/25/2025 12:55 AM, Nicolas Dufresne wrote:
Le lundi 24 novembre 2025 à 11:39 -0500, Nicolas Dufresne a écrit :
Hi,
Le lundi 24 novembre 2025 à 10:49 -0500, Frank Li a écrit :
On Mon, Nov 24, 2025 at 09:38:15AM +0800, Ming Qian(OSS) wrote:
Hi Frank,
On 11/22/2025 12:08 AM, Frank Li wrote:
On Fri, Nov 21, 2025 at 04:19:09PM +0800, ming.qian@xxxxxxxxxxx wrote:
From: Ming Qian <ming.qian@xxxxxxxxxxx>...
For the i.MX8MQ platform, there is a hardware limitation: the g1 VPU and
g2 VPU cannot decode simultaneously; otherwise, it will cause below bus
error and produce corrupted pictures, even led to system hang.
[ 110.527986] hantro-vpu 38310000.video-codec: frame decode timed out.
[ 110.583517] hantro-vpu 38310000.video-codec: bus error detected.
Therefore, it is necessary to ensure that g1 and g2 operate alternately.
Then this allows for successful multi-instance decoding of H.264 and
HEVC.
Fixes: cb5dd5a0fa518 ("media: hantro: Introduce G2/HEVC decoder")
Signed-off-by: Ming Qian <ming.qian@xxxxxxxxxxx>
---
drivers/media/platform/verisilicon/hantro.h | 1 +
.../media/platform/verisilicon/hantro_drv.c | 26 +++++++++++++++++++
.../media/platform/verisilicon/imx8m_vpu_hw.c | 4 +++
3 files changed, 31 insertions(+)
#include <linux/workqueue.h>
+#include <linux/iopoll.h>
#include <media/v4l2-event.h>
#include <media/v4l2-mem2mem.h>
#include <media/videobuf2-core.h>
@@ -93,6 +94,9 @@ static void hantro_job_finish(struct hantro_dev *vpu,
clk_bulk_disable(vpu->variant->num_clocks, vpu->clocks);
+ if (vpu->variant->shared_resource)
+ atomic_cmpxchg(vpu->variant->shared_resource, 0, 1);
+
hantro_job_finish_no_pm(vpu, ctx, result);
}
@@ -166,12 +170,34 @@ void hantro_end_prepare_run(struct hantro_ctx
*ctx)
msecs_to_jiffies(2000));
}
+static int hantro_wait_shared_resource(struct hantro_dev *vpu)
+{
+ u32 data;
+ int ret;
+
+ if (!vpu->variant->shared_resource)
+ return 0;
+
+ ret = read_poll_timeout(atomic_cmpxchg, data, data, 10, 300 *
NSEC_PER_MSEC, false,
+ vpu->variant->shared_resource, 1, 0);
+ if (ret) {
+ dev_err(vpu->dev, "Failed to wait shared resource\n");
+ return -EINVAL;
+ }
why not use a mutex?
mutex() lock here, unlock at hantro_job_finish(), if second instance
run to here, mutex() will block thread, until previous hantro_job_finish()
finish.
Frank
G1 and G2 are two different devices. If I were to use a mutex, I would
need to define a global mutex. Therefore, to avoid using a global mutex,
I only define a static atomic variable.
static atomic varible also is global. Global mutex is allowed if it is
really needed.
If a static mutex is acceptable, I think I can change it to a mutex.
ref to
https://elixir.bootlin.com/linux/v6.18-rc6/source/drivers/base/core.c#L43
My main concern with either of these approaches is that it kills the ability to
rmmod the driver forever. The only working approach would be that both drivers
depends on a third driver that provide the synchronization services.
I do realize after the fact that my answer is a little off considering its a
drivers against itself (not cross-driver, that would be a huge pain if it was
the case).
Checking further, the ref to the counter (or mutex) should cleanly be gone by
the time the driver is removed, so perhaps its fine, though best to test it.
Though, in both cases, I'm not happy to see code that will wait for multiple
milliseconds on either home made mutex or a real mutex. Adding another arbitrary
timeout is also not very nice either. The current software watchdog already get
in the way when testing simulated IP.
I know its work, but what about a recounted singleton, with a notifier so we can
schedule work when the resource is free ?
Nicolas
Thank you for your comments. I will consider a better solution.
Regards,
Ming
Nicolas
Frank
Regards,
Ming
+...
+ return 0;
+}
+
static void device_run(void *priv)
{
struct hantro_ctx *ctx = priv;
struct vb2_v4l2_buffer *src, *dst;
int ret;
+ ret = hantro_wait_shared_resource(ctx->dev);
+ if (ret < 0)
+ goto err_cancel_job;
+
src = hantro_get_src_buf(ctx);
dst = hantro_get_dst_buf(ctx);
--
2.34.1