Re: Please stop using iopl() in DPDK

From: Andy Lutomirski
Date: Fri Oct 25 2019 - 10:46:03 EST


On Thu, Oct 24, 2019 at 11:42 PM Willy Tarreau <w@xxxxxx> wrote:
>
> Hi Andy,
>
> On Thu, Oct 24, 2019 at 09:45:56PM -0700, Andy Lutomirski wrote:
> > Hi all-
> >
> > Supporting iopl() in the Linux kernel is becoming a maintainability
> > problem. As far as I know, DPDK is the only major modern user of
> > iopl().
> >
> > After doing some research, DPDK uses direct io port access for only a
> > single purpose: accessing legacy virtio configuration structures.
> > These structures are mapped in IO space in BAR 0 on legacy virtio
> > devices.
> >
> > There are at least three ways you could avoid using iopl(). Here they
> > are in rough order of quality in my opinion:
> (...)
>
> I'm just wondering, why wouldn't we introduce a sys_ioport() syscall
> to perform I/Os in the kernel without having to play at all with
> iopl()/ioperm()? That would alleviate the need for these large port maps.
> Applications that use outb/inb() usually don't need extreme speeds.
> Each time I had to use them, it was to access a watchdog, a sensor, a
> fan, control a front panel LED, or read/write to NVRAM. Some userland
> drivers possibly don't need much more, and very likely run with
> privileges turned on all the time, so replacing their inb()/outb() calls
> would mostly be a matter of redefining them using a macro to use the
> syscall instead.
>
> I'd see an API more or less like this:
>
> int ioport(int op, u16 port, long val, long *ret);

Hmm. I have some memory of a /dev/ioport or similar, but now I can't
find it. It does seem quite reasonable.

But, for uses like DPDK, /sys/.../resource0 seems like a *far* better
API, since it actually uses the kernel's concept of which I/O port range
corresponds to which device instead of hoping that the mappings don't
change out from under user code. And it has the added benefit that
access is restricted to a single device.
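
A minimal sketch of what the resource0 approach could look like from
user code, assuming the kernel allows plain read() on the sysfs resource
file of an I/O-port BAR, with the file offset taken relative to the
start of the BAR. bar_read8() is a hypothetical helper name, and the
sysfs path is left to the caller because it depends on the device's PCI
address.

```c
/* Sketch only: read one byte from a PCI BAR through its sysfs
 * resource file instead of using inb() after iopl(). */
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* path: the device's resource0 file in sysfs (supplied by the caller).
 * off:  byte offset within the BAR, not an absolute port number.
 * Returns 0 on success, -1 if the file cannot be opened or read. */
static int bar_read8(const char *path, off_t off, uint8_t *out)
{
	ssize_t n;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;
	n = pread(fd, out, 1, off);
	close(fd);
	return n == 1 ? 0 : -1;
}
```

Because the offset is relative to the BAR, user code never needs to
know which absolute port range the device was assigned. A real driver
would presumably keep the descriptor open rather than reopening the
file on every access.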

--Andy