Re: [Xen-devel] [PATCH 0/1] drm/xen-zcopy: Add Xen zero-copy helper DRM driver

From: Oleksandr Andrushchenko
Date: Tue Apr 24 2018 - 06:14:48 EST

On 04/24/2018 01:01 PM, Wei Liu wrote:
On Tue, Apr 24, 2018 at 11:08:41AM +0200, Juergen Gross wrote:
On 24/04/18 11:03, Oleksandr Andrushchenko wrote:
On 04/24/2018 11:40 AM, Juergen Gross wrote:
On 24/04/18 10:07, Oleksandr Andrushchenko wrote:
On 04/24/2018 10:51 AM, Juergen Gross wrote:
On 24/04/18 07:43, Oleksandr Andrushchenko wrote:
On 04/24/2018 01:41 AM, Boris Ostrovsky wrote:
On 04/23/2018 08:10 AM, Oleksandr Andrushchenko wrote:
On 04/23/2018 02:52 PM, Wei Liu wrote:
On Fri, Apr 20, 2018 at 02:25:20PM +0300, Oleksandr Andrushchenko
the gntdev.

I think this is generic enough that it could be implemented by a
device not tied to Xen. AFAICT the hyper_dma guys also wanted
something similar to this.
You can't just wrap random userspace memory into a dma-buf. We've just
had this discussion with kvm/qemu folks, who proposed just that, and
after a bit of discussion they'll now try to have a driver which just
wraps a memfd into a dma-buf.
So, we have to decide whether to introduce a new driver
(say, under drivers/xen/xen-dma-buf) or extend the existing
gntdev/balloon to support dma-buf use-cases.

Can anybody from Xen community express their preference here?

Oleksandr talked to me on IRC about this, he said a few IOCTLs
need to
be added to either existing drivers or a new driver.

I went through this thread twice and skimmed through the relevant
documents, but I couldn't see any obvious pros and cons for either
approach. So I don't really have an opinion on this.

But, assuming they are implemented in existing drivers, those IOCTLs
need to be added to different drivers, which means the userspace program
needs to write more code and get more handles, so it would be slightly
better to implement a new driver from that perspective.
If gntdev/balloon extension is still considered:

All the IOCTLs will be in gntdev driver (in current xen-zcopy

Balloon driver extension, which is needed for contiguous/DMA
buffers, will be to provide new *kernel API*, no UAPI is needed.

So I am obviously a bit late to this thread, but why do you need to add
new ioctls to gntdev and balloon? Doesn't this driver manage to do
what you want without any extensions?
1. I only (may) need to add IOCTLs to gntdev
2. balloon driver needs to be extended, so it can allocate
contiguous (DMA) memory; no IOCTLs/UAPI here, it all lives
in the kernel.
3. The reason I need to extend gnttab with new IOCTLs is to
provide new functionality to create a dma-buf from grant references
and to produce grant references for a dma-buf. This is what I have as
description for xen-zcopy driver:

This will create a DRM dumb buffer from grant references provided
by the frontend. The intended usage is:
  - Frontend
    - creates a dumb/display buffer and allocates memory
    - grants foreign access to the buffer pages
    - passes granted references to the backend
  - Backend
    - issues DRM_XEN_ZCOPY_DUMB_FROM_REFS ioctl to map
      granted references and create a dumb buffer
    - requests handle to fd conversion via
    - requests real HW driver/consumer to import the PRIME buffer
    - uses handle returned by the real HW driver
  - at the end:
    - closes real HW driver's handle with DRM_IOCTL_GEM_CLOSE
    - closes zero-copy driver's handle with DRM_IOCTL_GEM_CLOSE
    - closes file descriptor of the exported buffer
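
To make the "passes granted references" step a bit more concrete, here is a rough sketch of how many grant references such a buffer needs. It assumes one grant reference per 4 KiB page and a buffer with no row padding; the helper name grefs_needed() and the XEN_PAGE_BYTES macro are made up for illustration and are not part of the driver.

```c
#include <stdint.h>

/* Assumption: one grant reference covers one 4 KiB Xen page. */
#define XEN_PAGE_BYTES 4096u

/*
 * Hypothetical helper, not part of any driver: number of grant
 * references needed to share a width x height buffer at bpp bits
 * per pixel, assuming rows are packed without padding.
 */
static uint32_t grefs_needed(uint32_t width, uint32_t height, uint32_t bpp)
{
    uint64_t pitch = (uint64_t)width * ((bpp + 7u) / 8u); /* bytes per row */
    uint64_t size = pitch * height;                       /* bytes in total */

    /* Round up to whole pages. */
    return (uint32_t)((size + XEN_PAGE_BYTES - 1u) / XEN_PAGE_BYTES);
}
```

For a 1920x1080 buffer at 32 bpp this gives 2025 references (8294400 bytes / 4096), which is the array size the frontend would have to hand over to the backend under these assumptions.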

This will grant references to a dumb/display buffer's memory
provided by
backend. The intended usage is:
  - Frontend
    - requests backend to allocate dumb/display buffer and grant
      to its pages
  - Backend
    - requests real HW driver to create a dumb with
    - requests handle to fd conversion via
    - requests zero-copy driver to import the PRIME buffer with
      grant references to the buffer's memory.
    - passes grant references to the frontend
  - at the end:
    - closes zero-copy driver's handle with DRM_IOCTL_GEM_CLOSE
    - closes real HW driver's handle with DRM_IOCTL_GEM_CLOSE
    - closes file descriptor of the imported buffer

This will block until the dumb buffer with the wait handle provided be
this is needed for synchronization between frontend and backend in
frontend provides grant references of the buffer via
DRM_XEN_ZCOPY_DUMB_FROM_REFS IOCTL and which must be released before
backend replies with XENDISPL_OP_DBUF_DESTROY response.
wait_handle must be the same value returned while calling

So, as you can see the above functionality is not covered by the
existing UAPI
of the gntdev driver.
Now, if we change dumb -> dma-buf and remove DRM code (which is only a
here on top of dma-buf) we get new driver for dma-buf for Xen.

This is why I have 2 options here: either create a dedicated driver
(e.g. re-work xen-zcopy to be DRM independent and put it under
drivers/xen/xen-dma-buf, for example) or extend the existing gntdev
with the above UAPI + make changes to the balloon driver to provide
API for DMA buffer allocations.
Which user component would use the new ioctls?
It is currently used by the display backend [1] and will
probably be used by the hyper-dmabuf frontend/backend
(Dongwon from Intel can provide more info on this).
I'm asking because I'm not very fond of adding more linux specific
functions to libgnttab which are not related to a specific Xen version,
but to a kernel version.
Hm, I was not thinking about this UAPI to be added to libgnttab.
It seems it can be used directly w/o wrappers in user-space
Would this program use libgnttab in parallel?
In case of the display backend - yes, for shared rings,
extracting grefs from displif protocol it uses gntdev via
helper library [1]
 If yes how would the two
usage paths be combined (same applies to the separate driver, btw)? The
gntdev driver manages resources per file descriptor and libgnttab is
hiding the file descriptor it is using for a connection.
Ah, at the moment the UAPI was not used in parallel as there were
2 drivers for that: gntdev + xen-zcopy with different UAPIs.
But now, if we extend gntdev with the new API then you are right:
either libgnttab needs to be extended or that new part of the
gntdev UAPI needs to be open-coded by the backend
 Or would the
user program use only the new driver for communicating with the gntdev
driver? In this case it might be an option to extend the gntdev driver
to present a new device (e.g. "gntdmadev") for that purpose.
No, it seems that libgnttab and this new driver's UAPI will be used
in parallel
So doing this in a separate driver seems to be the better option in
this regard.
Well, from maintenance POV it is easier for me to have it all in
a separate driver as all dma-buf related functionality will
reside at one place. This also means that no changes to existing
drivers will be needed (if it is ok to have ballooning in/out
code for DMA buffers (allocated with dma_alloc_xxx) not in the balloon
I think in the end this really depends on what the complete solution
will look like. gntdev is a special wrapper for the gnttab driver.
In case the new dma-buf driver needs to use parts of gntdev I'd rather
have a new driver above gnttab ("gntuser"?) used by gntdev and dma-buf.
The new driver doesn't use gntdev's existing API, but extends it,
e.g. by adding new ways to export/import grefs for a dma-buf and
manage dma-buf's kernel ops. Thus, gntdev, which already provides
UAPI, seems to be a good candidate for such an extension
So this would mean you need a modification of libgnttab, right? This is
something the Xen tools maintainers need to decide. In case they don't
object, extending the gntdev driver would be the natural thing to do.

That should be fine. Most of what libgnttab does is to wrap existing kernel
interfaces and expose them sensibly to user space programs. If gnttab
device is extended, libgnttab should be extended accordingly. If a new
device is created, a new library should be added. Either way there will
be new toolstack code involved, which is not a problem in general.
Great, so finally I see the following approach to have generic
dma-buf use-cases support for Xen (which can be used for many purposes,
e.g. GPU/DRM buffer sharing, V4L, hyper-dmabuf etc.):

1. Extend Linux gntdev driver to support 3 new IOCTLs discussed previously
2. Extend libgnttab to provide UAPI for those - Linux only as dma-buf
is a Linux thing
3. Extend kernel API of the Linux balloon driver to allow dma_alloc_xxx way
of memory allocations
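
To have something concrete to discuss for step 1, the request layouts for the three operations might look roughly like the sketch below. Every name and field here is hypothetical (including the timeout field); it only mirrors the three operations described earlier in this thread (dma-buf from grant references, grant references for a dma-buf, wait until the buffer is freed) and is not an agreed UAPI.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* All names below are hypothetical and for discussion only. */

/* 1. Build a dma-buf from grant references supplied by another domain. */
struct gnt_dmabuf_from_refs {
    uint32_t count;       /* in: number of entries in refs[] */
    uint32_t domid;       /* in: domain that granted the pages */
    int32_t  fd;          /* out: file descriptor of the new dma-buf */
    uint32_t wait_handle; /* out: token for the wait operation */
    uint32_t refs[];      /* in: grant references, one per page */
};

/* 2. Import a dma-buf and produce grant references for its pages. */
struct gnt_dmabuf_to_refs {
    int32_t  fd;          /* in: dma-buf file descriptor to import */
    uint32_t count;       /* in: capacity of refs[] */
    uint32_t domid;       /* in: domain to grant access to */
    uint32_t refs[];      /* out: grant references, one per page */
};

/* 3. Block until the buffer identified by wait_handle is released. */
struct gnt_dmabuf_wait_free {
    uint32_t wait_handle; /* in: value returned by operation 1 */
    uint32_t timeout_ms;  /* in: assumed timeout, not from the thread */
};

/* Size in bytes of a variable-length "from refs" request. */
static size_t from_refs_size(uint32_t count)
{
    return offsetof(struct gnt_dmabuf_from_refs, refs) +
           (size_t)count * sizeof(uint32_t);
}

/* Allocate and fill a "from refs" request; caller frees with free(). */
static struct gnt_dmabuf_from_refs *
from_refs_alloc(uint32_t domid, const uint32_t *refs, uint32_t count)
{
    struct gnt_dmabuf_from_refs *req = calloc(1, from_refs_size(count));

    if (!req)
        return NULL;
    req->count = count;
    req->domid = domid;
    req->fd = -1; /* no dma-buf yet; the driver would fill this in */
    if (count)
        memcpy(req->refs, refs, (size_t)count * sizeof(uint32_t));
    return req;
}
```

The grant references are carried as a trailing array so that a single request describes the whole buffer; whether the final UAPI does it this way or passes a user pointer instead is an open implementation detail.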

If the above looks ok, then I can start prototyping, so we can discuss
implementation details

Dongwon - could you please comment on all this if it fits your use-cases
(I do believe it does)?

Thank you,