Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver

From: Vlad Zolotarov
Date: Tue Oct 06 2015 - 04:38:12 EST




On 10/05/15 22:50, Michael S. Tsirkin wrote:
On Sun, Oct 04, 2015 at 11:43:15PM +0300, Vlad Zolotarov wrote:
This series add support for MSI and MSI-X interrupts to uio_pci_generic driver.
Currently uio_pci_generic supports only legacy INT#x interrupts source. However
there are situations when this is not enough, for instance SR-IOV VF devices that
simply don't have INT#x capability. For such devices uio_pci_generic will simply
fail (more specifically probe() will fail).
When IOMMU is either not available (e.g. Amazon EC2) or not acceptable due to performance
overhead and thus VFIO is not an option
users that develop user-space drivers are left
without any option but to develop some proprietary UIO drivers (e.g. igb_uio driver in Intel's
DPDK) just to be able to use UIO infrastructure.
This series provides a generic solution for this problem while preserving the original behaviour
for devices for which the original uio_pci_generic had worked before (i.e. INT#x will be used by default).
What is missing here is that drivers using uio_pci_generic generally
poke at config and BAR sysfs files of the device.

We can not stop them without breaking existing users, but this means
that we can't enable bus mastering and MSI/MSI-X blindly: userspace
bugs will corrupt the MSI-X table and/or MSi/MSI-X capability,
and cause device to overwrite random addresses, corrupting kernel
memory.

Your solution seems to be a warning in dmesg and tainting the
kernel, but that's not enough.

You need to add infrastructure to prevent this.

Bus mastering is easily enabled from the user space (taken from DPDK code):

static int
pci_uio_set_bus_master(int dev_fd)
{
uint16_t reg;
int ret;

ret = pread(dev_fd, &reg, sizeof(reg), PCI_COMMAND);
if (ret != sizeof(reg)) {
RTE_LOG(ERR, EAL,
"Cannot read command from PCI config space!\n");
return -1;
}

/* return if bus mastering is already on */
if (reg & PCI_COMMAND_MASTER)
return 0;

reg |= PCI_COMMAND_MASTER;

ret = pwrite(dev_fd, &reg, sizeof(reg), PCI_COMMAND);
if (ret != sizeof(reg)) {
RTE_LOG(ERR, EAL,
"Cannot write command to PCI config space!\n");
return -1;
}

return 0;
}

So, this is a non-issue. ;)


VFIO has some code to do this, but it's not bound by existing UIO API so it
simply fails the mmap. We want I think existing applications to work,
so I suspect we need to make a hole there (probably map a zero page in
case apps want to read it, and maybe even set it up for COW in case they
tweak the PBA which sometimes happens to be in the same page).

Your patches also seem to add in eventfd and mmap capabilities which
seems to be orthogonal. They are there in VFIO which I'm guessing is the
real reason you do it.

So, what you are trying to do might be closer to extending VFIO which
already has a bunch of checks like that. Yes, it also wants to program
the IOMMU. So maybe do it with a separate device that can be root-only,
so unpriveledged users can't abuse it.

You should Cc, and talk to the VFIO maintainer.



New in v3:
- Add __iomem qualifier to temp buffer receiving ioremap value.

New in v2:
- Added #include <linux/uaccess.h> to uio_pci_generic.c

Vlad Zolotarov (3):
uio: add ioctl support
uio_pci_generic: add MSI/MSI-X support
Documentation: update uio-howto

Documentation/DocBook/uio-howto.tmpl | 29 ++-
drivers/uio/uio.c | 15 ++
drivers/uio/uio_pci_generic.c | 410 +++++++++++++++++++++++++++++++++--
include/linux/uio_driver.h | 3 +
include/linux/uio_pci_generic.h | 36 +++
5 files changed, 467 insertions(+), 26 deletions(-)
create mode 100644 include/linux/uio_pci_generic.h

--
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/