Re: Hardware-accelerated video decoders used through a firmware instead of hardware registers

From: Nicolas Dufresne
Date: Sun May 12 2019 - 10:19:21 EST

On Sunday, 12 May 2019 at 13:35 +0200, Paul Kocialkowski wrote:
> Hi,
> With the work done on the media request API and the cedrus driver for
> Allwinner ARM SoCs, we now have a kernel interface for exposing fixed-
> hardware video decoding pipelines (currently MPEG-2 and H.264, with
> H.265 on the way). Some work remains on the per-format interface and we
> are looking to improve latency-related aspects, but we are all set to
> have a nice interface here, that plays well with e.g. ffmpeg.
> A specific situation came to my interest, which is apparently quite
> common: some platforms have general-purpose microcontrollers embedded,
> which can help with video decoding. They are however rarely to never
> used to do the decoding itself (since they are general-purpose, not
> DSPs) and just coordinate the decoding with the fixed-pipeline decoding
> hardware block. The advantage is that the interface is just a simple
> mailbox and the raw video bitstream from the file can be passed
> directly without the need for userspace to do any parsing that the
> codec requires.
> One side-effect from this setup is that the actual hardware register
> layout of the decoder is hidden away in a non-free piece of
> microcontroller firmware, that's usually loaded at run-time.
> With the recent developments on the media interface, we could interface
> with these hardware decoders directly, which offers various advantages:
> - we no longer need a 3rd party external non-free firmware, which just
> makes distribution easier for everyone and allows support in fully-
> free setups;
> - all the usual advantages of having free code that can be fixed and
> updated instead of an obscure binary that may not always be doing
> the right thing;
> - parsing of the slices is probably best done in userspace, and I
> heard that ffmpeg does this threaded, so there could be a latency
> advantage there as well, not to mention that it avoids the drag of
> a mailbox interface altogether;
> - the general-purpose micro-controller can then be reused for something
> useful where it could actually make a performance difference.
> As far as I understand, it seems that the video decoder for MT8173
> falls in that category, where an MD32 general-purpose micro-controller
> is used to only do the parsing. We even have device-tree nodes about
> the decoder and encoder, but no register layout.
> So I was wondering if the linux-media community should set some
> boundaries here and push towards native implementations instead of
> firmware-based ones. My opinion is that it definitely should.
> It seems that other platforms (e.g. Tegra K1 and onwards) are in the
> same situation, and I think the ChromiumOS downstream kernel uses an
> obscure firmware on a general-purpose auxiliary ARM core (that's also
> used at boot time IIRC).

I like the idea, but enforcing this now is likely to prevent a lot of
mainline usage of CODECs (which are patent-encumbered to start with).
One thing to note: the CODEC accelerators may not be accessible from
the CPU. So to support such an idea, we'd need to develop minimalist
firmware to access these accelerators. That would require a lot of
reverse engineering, as the third-party codec vendors (e.g.
Chips&Media, Allegro, etc.) don't document the accelerator or even the
architecture of the micro-controller. Compiling this firmware can
also become tedious, especially if there is no Open Source compiler for
the chosen micro-controller architecture.

I can comment on ChromeOS: the current generation is mostly based on
Rockchip SoCs. The CODECs on Rockchip are just accelerators, and this is
what the ChromeOS team implemented, and that's what the stateless work
you have done is based upon. The first generation was Samsung Exynos,
which uses a design of unknown origin that they call MFC. It runs a
proprietary blob, and I have not found any information about this blob.

The early boot stage is not obscure, it's called CoreBoot. This code is
meant to initialize your CPU when your CPU isn't started yet. Notably on
Intel, there have been a lot of security concerns with this proprietary
blob; the CoreBoot effort includes reverse engineering and replacing this
bit. At least with the Intel blobs, the micro-controller is still running
after your main CPU is up, giving attackers a place to run with
true full access to your computer, without being detectable.

On some platforms it can be even more complex. Think of the Xilinx
ZynqMP. Documentation is pretty sparse, but it's clear the VCU is only
accessible from the FPGA, and that's probably why we need a MicroBlaze
firmware (MicroBlaze being a micro-controller architecture programmed
into some part of a Xilinx FPGA) in order to use it. But then, it is
not clear whether the VCU is fully capable of decoding, or whether the
work is a mix of FPGA and circuit. So replacing the firmware could be
the same as rewriting the CODEC HW (or at least some bits of it).

> What do you think?
> Cheers,
> Paul