Re: [dpdk-dev] Please stop using iopl() in DPDK

From: Stephen Hemminger
Date: Fri Oct 25 2019 - 12:13:17 EST

On Thu, 24 Oct 2019 21:45:56 -0700
Andy Lutomirski <luto@xxxxxxxxxx> wrote:

> Hi all-
> Supporting iopl() in the Linux kernel is becoming a maintainability
> problem. As far as I know, DPDK is the only major modern user of
> iopl().
> After doing some research, DPDK uses direct io port access for only a
> single purpose: accessing legacy virtio configuration structures.
> These structures are mapped in IO space in BAR 0 on legacy virtio
> devices.

Yes. Legacy virtio seems to have been designed without considering
how it would be used from userspace. Xen, VMware and Hyper-V all use
memory as the doorbell mechanism, which is easier to use from userspace.

> There are at least three ways you could avoid using iopl(). Here they
> are in rough order of quality in my opinion:
> 1. Change pci_uio_ioport_read() and pci_uio_ioport_write() to use
> read() and write() on resource0 in sysfs.

Entering the kernel on every doorbell write is too expensive and
would kill performance.
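
To make the cost concrete, here is a rough sketch of what option 1
amounts to. It assumes the kernel accepts 1, 2 or 4 byte reads and
writes on the resource0 file of an I/O port BAR, with the file offset
taken as the offset inside the BAR. The helper names are made up for
illustration and are not DPDK API; offset 16 is the queue notify
register in the legacy virtio PCI header.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Illustrative name: queue notify register offset in the legacy header. */
#define LEGACY_VIRTIO_QUEUE_NOTIFY 16

/* Hypothetical helper: open the sysfs resource0 file of the device. */
static int
sysfs_ioport_open(const char *resource0_path)
{
	int fd = open(resource0_path, O_RDWR);

	return fd < 0 ? -errno : fd;
}

/* Read `width` bytes at offset `off` inside the BAR: one syscall each. */
static int
sysfs_ioport_read(int fd, off_t off, void *buf, size_t width)
{
	ssize_t n;

	assert(buf != NULL);
	if (width != 1 && width != 2 && width != 4)
		return -EINVAL;
	n = pread(fd, buf, width, off);
	if (n < 0)
		return -errno;
	return (size_t)n == width ? 0 : -EIO;
}

/* Write `width` bytes at offset `off` inside the BAR: one syscall each. */
static int
sysfs_ioport_write(int fd, off_t off, const void *buf, size_t width)
{
	ssize_t n;

	assert(buf != NULL);
	if (width != 1 && width != 2 && width != 4)
		return -EINVAL;
	n = pwrite(fd, buf, width, off);
	if (n < 0)
		return -errno;
	return (size_t)n == width ? 0 : -EIO;
}

/* Doorbell: tell the device that queue `queue_idx` has new buffers. */
static int
sysfs_virtio_notify(int fd, uint16_t queue_idx)
{
	return sysfs_ioport_write(fd, LEGACY_VIRTIO_QUEUE_NOTIFY,
				  &queue_idx, sizeof(queue_idx));
}
```

Every sysfs_virtio_notify() call is a pwrite() system call, where the
iopl() path is a single outw instruction issued from userspace.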

> 2. Use the alternative access mechanism in the virtio legacy spec:
> there is a way to access all of these structures via configuration
> space.

There is no way to use a memory doorbell on older versions of virtio.
Users want to run DPDK on old stuff like RHEL6 and even older
kernel forks. There are even use cases where virtio is used on
a non-Linux host, such as GCP.

> 3. Use ioperm() instead of iopl().

ioperm() has the wrong thread semantics. All DPDK applications have
multiple threads, and the initialization logic needs to work even
for threads that are started later; threads can also be started by
the user application.

iopl() applies to the whole process, so this is not an issue.

> We are considering changes to the kernel that will potentially harm
> the performance of any program that uses iopl(3) -- in particular,
> context switches will become more expensive, and the scheduler might
> need to explicitly penalize such programs to ensure fairness. Using
> ioperm() already hurts performance, and the proposed changes to iopl()
> will make it even worse. Alternatively, the kernel could drop iopl()
> support entirely. I will certainly make a change to allow
> distributions to remove iopl() support entirely from their kernels,
> and I expect that distributions will do this.
> Please fix DPDK.

Please fix virtio.