Re: [RFC PATCH v4 06/10] VFIO_PLATFORM: Read and write support forthe device fd

From: Scott Wood
Date: Mon Feb 10 2014 - 18:13:12 EST


On Mon, 2014-02-10 at 15:45 -0700, Alex Williamson wrote:
> On Sat, 2014-02-08 at 18:29 +0100, Antonios Motakis wrote:
> > VFIO returns a file descriptor which we can use to manipulate the memory
> > regions of the device. Since some memory regions we cannot mmap due to
> > security concerns, we also allow to read and write to this file descriptor
> > directly.
> >
> > Signed-off-by: Antonios Motakis <a.motakis@xxxxxxxxxxxxxxxxxxxxxx>
> > Tested-by: Alvise Rigo <a.rigo@xxxxxxxxxxxxxxxxxxxxxx>
> > ---
> > drivers/vfio/platform/vfio_platform.c | 128 +++++++++++++++++++++++++++++++++-
> > 1 file changed, 125 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
> > index f7db5c0..ee96078 100644
> > --- a/drivers/vfio/platform/vfio_platform.c
> > +++ b/drivers/vfio/platform/vfio_platform.c
> > @@ -55,7 +55,8 @@ static int vfio_platform_regions_init(struct vfio_platform_device *vdev)
> >
> > region.addr = res->start;
> > region.size = resource_size(res);
> > - region.flags = 0;
> > + region.flags = VFIO_REGION_INFO_FLAG_READ
> > + | VFIO_REGION_INFO_FLAG_WRITE;
> >
> > vdev->region[i] = region;
> > }
> > @@ -150,13 +151,134 @@ static long vfio_platform_ioctl(void *device_data,
> > static ssize_t vfio_platform_read(void *device_data, char __user *buf,
> > size_t count, loff_t *ppos)
> > {
> > - return 0;
> > + struct vfio_platform_device *vdev = device_data;
> > + unsigned int *io;
> > + int i;
> > +
> > + for (i = 0; i < vdev->num_regions; i++) {
> > + struct vfio_platform_region region = vdev->region[i];
> > + unsigned int done = 0;
> > + loff_t off;
> > +
> > + if ((*ppos < region.addr)
> > + || (*ppos + count - 1) >= (region.addr + region.size))
> > + continue;
>
> Perhaps there's something to be said for vfio-pci's use of fixed offsets
> to have a direct offset to index lookup.
>
> > +
> > + io = ioremap_nocache(region.addr, region.size);
>
> This must incur some overhead per access.

There's mmap() if you want fast... Given the limited ioremap space on
32-bit, I can see not wanting to map everything that the user has open
all the time -- but in that case, wouldn't it be better to just map one
page here rather than the whole region?

> > +
> > + off = *ppos - region.addr;
> > +
> > + while (count) {
> > + size_t filled;
> > +
> > + if (count >= 4 && !(off % 4)) {
> > + u32 val;
> > +
> > + val = ioread32(io + off);
> > + if (copy_to_user(buf, &val, 4))
> > + goto err;
>
> For vfio-pci we've decided that these interfaces are always little
> endian, have you considered whether it makes sense to do something
> similar here? Thanks,

ioread32() is little endian -- but since read() puts its result in the
caller's memory buffer (rather than a register return), I think it makes
more sense to preserve byte-invariance -- similar to the conclusion of
the recent KVM MMIO API clarification discussion. Then the VFIO user
would use the same type of access (byte swapped or not) to access the
read() buffer that they would have used to access the register directly.

Forcing little endian is a better fit for PCI (which is inherently
little endian) than for platform devices which can be either endianness.

-Scott


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/