Re: [RFC]: shmem fd for non-DMA buffer sharing cross drivers

From: Hsia-Jun Li
Date: Fri Aug 25 2023 - 03:31:43 EST




On 8/23/23 21:15, Tomasz Figa wrote:


On Wed, Aug 23, 2023 at 4:11 PM Hsia-Jun Li <Randy.Li@xxxxxxxxxxxxx> wrote:



On 8/23/23 12:46, Tomasz Figa wrote:


Hi Hsia-Jun,

On Tue, Aug 22, 2023 at 8:14 PM Hsia-Jun Li <Randy.Li@xxxxxxxxxxxxx> wrote:

Hello

I would like to introduce a usage of SHMEM similar to DMA-buf; its major
purpose is sharing metadata, or serving as a plain container, across
drivers.

We need to exchange some sort of metadata between drivers, such as dynamic
HDR data between video4linux2 and DRM.

If the metadata isn't too big, would it be enough to just have the
kernel copy_from_user() to a kernel buffer in the ioctl code?

Or the graphics frame buffer is
too complex to be described with plain per-plane DMA-buf fds.
An issue between DRM and V4L2 is that DRM supports only 4 planes
while V4L2 supports 8. It would be pretty hard for DRM to extend its
interface to support 4 more planes, which would lead to revisions of
many standards like Vulkan and EGL.

Could you explain how a shmem buffer could be used to support frame
buffers with more than 4 planes?
If you are asking why we need this:

I'm asking how your proposal to use shmem FD solves the problem for those cases.

The shmem fd is a reference to a metadata container (a C struct in the kernel). Drivers (V4L2 and DRM) could then read this metadata when they process the major buffer (the SHMEM buffer is associated with a major buffer such as the graphics buffer).
1. Metadata like dynamic HDR tone data.
2. DRM is also challenged by this problem; let me quote what sima said:
"another trick that we iirc used for afbc is that sometimes the planes
have a fixed layout
like nv12
and so logically it's multiple planes, but you only need one plane slot
to describe the buffer
since I think afbc had the "we need more than 4 planes" issue too"

Unfortunately, there are vendor pixel formats that do not have a fixed layout.

3. Secure (REE, trusted video pipeline) info.
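
To illustrate the fixed-layout trick quoted in item 2, here is a minimal userspace sketch of how both NV12 planes can be derived from a single buffer description. The struct and function names are hypothetical and not taken from DRM or V4L2, and it assumes the chroma plane directly follows the luma plane with the same byte stride. Vendor formats without a fixed layout cannot be described this way, which is the point made above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of an NV12 buffer packed into one allocation. */
struct nv12_layout {
    size_t y_offset;    /* luma plane starts at the beginning of the buffer */
    size_t uv_offset;   /* interleaved CbCr plane follows the luma plane */
    size_t total_size;  /* bytes needed for both planes */
};

/*
 * Assumption: both planes share the same stride in bytes and are stored
 * back to back, so one plane slot is enough to locate both of them.
 */
struct nv12_layout nv12_fixed_layout(uint32_t stride, uint32_t height)
{
    struct nv12_layout l;
    /* Chroma is subsampled vertically by 2; round up for odd heights. */
    size_t chroma_rows = ((size_t)height + 1) / 2;

    l.y_offset = 0;
    l.uv_offset = (size_t)stride * height;
    l.total_size = l.uv_offset + (size_t)stride * chroma_rows;
    return l;
}
```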

As for how to assign such metadata:
with a drm fb_id it is simple, we just add a DRM plane property
for it. The V4L2 interface is not as flexible; we could only put it into
the CAPTURE request_fd as a control.

Also, there is no reason to consume a device's memory for content
that the device can't read, or to waste an IOMMU entry on such data.

That's right, but DMA-buf doesn't really imply any of those. DMA-buf
is just a kernel object with some backing memory. It's up to the
allocator to decide how the backing memory is allocated and up to the
importer on whether it would be mapped into an IOMMU.

I just want to say it can't be allocated from the same place as
those DMA-bufs (graphics or compressed bitstream).
This could also answer your first question: if we place this kind
of buffer in a plane for DMABUF (importing) in V4L2, the V4L2 core would try
to prepare it, which could map it into the IOMMU.


V4L2 core will prepare it according to the struct device that is given
to it. For the planes that don't have to go to the hardware a struct
device could be given that doesn't require any DMA mapping. Also you
can check how the uvcvideo driver handles it. It doesn't use the vb2
buffers directly, but always writes to them using CPU (due to how the
UVC protocol is designed).

Because it uses vb2_vmalloc_memops?
That vb2_vmalloc_attach_dmabuf() won't do anything.
Yes, I noticed it would copy the URB buffer to the vb2 buffer.
I don't know what stops that; is it because we can't assume xHCI or EHCI has an IOMMU?

I think that is not what I want, if you were not talking about META_CAPTURE, which would be an isolated dev node.
For example, we have an NV15 (2 planes) buffer with its HDR data.
We need its NV15 planes to be accessed by DMA directly, or it would be a performance issue (so a UVC-style memcpy is not acceptable), while its HDR data, which we just read from the device's registers or somewhere, should ship with exactly that buffer.

Even if we could expand the vb2_mem_ops interfaces to let them know which plane is which (e.g. planes 0 and 1 are graphics, plane 2 is the metadata), the purpose here is to not involve the metadata buffer in any DMA buffer procedure.
Usually, such metadata would be values that should be written to a
hardware's registers; a 4KiB page would hold 1024 32-bit register values.
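
The page-size arithmetic above can be checked with a small sketch. The names here are hypothetical and only illustrate the sizing; the 4 KiB page size is an assumption, since the real value is architecture dependent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define META_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/* Hypothetical metadata container: one page full of 32-bit register values. */
struct reg_page {
    uint32_t regs[META_PAGE_SIZE / sizeof(uint32_t)];
};

/* Number of 32-bit register values that fit in a page of the given size. */
size_t regs_per_page(size_t page_size)
{
    return page_size / sizeof(uint32_t);
}
```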

Still, I have some problems with SHMEM:
1. I don't want userspace to modify the content of the SHMEM allocated
by the kernel; is there a way to do so?

This is generally impossible without doing any of the two:
1) copying the contents to an internal buffer not accessible to the
userspace, OR
2) modifying any of the buffer mappings to read-only

2) can actually be more costly than 1) (depending on the architecture,
data size, etc.), so we shouldn't just discard the option of a simple
copy_from_user() in the ioctl.
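
A minimal userspace sketch of option 1) follows, with memcpy() standing in for copy_from_user(). The struct, the limit and the function name are hypothetical. The idea is that the consumer validates and uses only its private snapshot, so later writes to the shared buffer cannot affect it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define META_MAX_ENTRIES 16u  /* hypothetical upper bound */

/* Hypothetical metadata layout shared with the other side. */
struct hdr_meta {
    uint32_t num_entries;
    uint32_t entries[META_MAX_ENTRIES];
};

/*
 * Copy the shared metadata into a private buffer, then validate the copy.
 * Returns 0 on success, -1 if the snapshot fails validation.
 */
int meta_snapshot(struct hdr_meta *priv, const void *shared)
{
    memcpy(priv, shared, sizeof(*priv));  /* stand-in for copy_from_user() */
    if (priv->num_entries > META_MAX_ENTRIES)
        return -1;
    return 0;
}
```

Validating the copy rather than the shared buffer avoids the case where the other side changes a field between the check and the use.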

I don't want userspace to access it at all. So that won't be a problem.

In this case, wouldn't it be enough to have a DMA-buf exporter that
doesn't provide the mmap op?

Also, we want it allocated through vb2_mem_ops->alloc(). We could set an alloc_dev for a plane in queue_setup(), but where the metadata plane sits would depend on the pixel format.
It would be better not to have it in vb2_planes.
2. Should I create a helper function for installing the SHMEM file as a fd?

We already have the udmabuf device [1] to turn a memfd into a DMA-buf,
so maybe that would be enough?

[1] https://elixir.bootlin.com/linux/v6.5-rc7/source/drivers/dma-buf/udmabuf.c

It is the kernel driver that allocates this buffer. For example, v4l2
CAPTURE allocates a buffer for metadata at VIDIOC_REQBUFS.
Or GBM gives you an fd which is associated with a surface.

So we need a kernel interface.

Sorry, I'm confused. If we're talking about buffers allocated by the
specific allocators like V4L2 or GBM, why do we need SHMEM at all?

I will be in the IRC channel 5 hours from now and this weekend; if anything is confusing, we could talk there.
Best,
Tomasz


--
Hsia-Jun(Randy) Li
