Re: [PATCH RFC 0/1] VFIO: Region-specific file descriptors

From: Shawn Anastasio
Date: Wed Oct 02 2019 - 17:57:49 EST


On 10/1/19 10:38 AM, Alex Williamson wrote:
On Mon, 30 Sep 2019 18:55:32 -0500 Shawn Anastasio <shawn@xxxxxxxxxx>
wrote:

This patch adds region file descriptors to VFIO, a simple file
descriptor type that allows read/write/mmap operations on a single
region of a VFIO device.

This feature is particularly useful for privileged applications
that use VFIO and wish to share file descriptors with unprivileged
applications without handing over full control of the device.

Such as? How do we defined "privileged"? VFIO already allows "unprivileged applications" to own a device, only file permissions
are necessary for the VFIO group. Does region level granularity
really allow us to claim that the consumer of this fd doesn't have
full control of the device? Clearly device ioctls, including
interrupts, and DMA mappings are not granted with only access to a
region, but said unprivileged application may have absolute full
control of the device itself via that region.

Yes, that's true - determining whether any control was restricted will
depend on the specifics of the device and region shared.

The use case I had in mind when implementing this was QEMU's ivshmem
device. I'm writing a daemon that uses VFIO to establish a shared
memory channel with the host via ivshmem devices and then passes
the shared memory region to unprivileged clients over unix domain
sockets. In this case, it is beneficial to have a way to only share
BAR 2 of the ivshmem device (the shared memory region) without giving
control over device configuration and interrupts.

It would also perhaps be useful to restrict the read/write/mmap
abilities of a region fd at time of creation, though the patch
as-is doesn't implement that.

It also allows applications to use regular offsets in
read/write/mmap instead of the region index + offset that must be
used with device file descriptors.

How is this actually an issue that needs a solution?

It allows applications that expect memfd/shm style semantics to
work without modification. In the use case I mentioned, it allows
the unprivileged clients to use any received shared memory fds without
knowledge of VFIO-specific semantics. This means that the code paths for
the host, where regular memfds are passed, and the guest, where VFIO
region fds are passed can be the same.

The current implementation is very raw (PCI only, no reference
counting which is probably wrong), but I wanted to get a sense to
see if this feature is desired. If it is, tips on how to implement
this more correctly are appreciated.

Handling the ownership and life cycle of the region fds is the more complicated problem. If an unprivileged user has an mmap to a
device owned by a privileged user, how does it get revoked by the
privileged part of this equation?

Yes, this is something that I've been thinking about. IIUC, the current
patch results in all region fds being invalidated when the privileged
process drops the device fd, but this may not be the best solution.

Perhaps having region fds bump the device struct reference counts
so that region fds can outlive device fds makes more sense. This
wouldn't allow the privileged process to revoke region access, though.

How do we decide which regions merit this support, for instance a
device specific region could have just as viable a use case as a BAR.
Why does this code limit support to regions supporting mmap but then
support read/write as well?

This was an arbitrary decision I made while testing and not necessarily
the behavior I wish to keep. Perhaps it would make sense to allow
region fd creation for regions that support any of r, w, mmap.

Technically, isn't the extent of functionality provided in this RFC already available via the PCI resource files in sysfs?

That's a good point, though having the functionality within the
VFIO framework would be nice as well. There's plenty of functionality
already duplicated between sysfs, UIO, and VFIO.

Without a concrete use case, this looks like a solution in search of
a problem. Thanks,

Hopefully the use case I described above makes sense. This is my first
time writing a PCI driver, so I don't know if this use case is a bit
contrived and not applicable outside my specific application.

I don't think it's too unreasonable to think that there are some
other applications where restricting access to specific regions
would be useful, though.

Thanks,
Shawn