Re: [PATCH v5 2/3] fs: openat2: Extend open_how to allow userspace-selected fds

From: Josh Triplett
Date: Thu Apr 23 2020 - 00:42:38 EST


On Thu, Apr 23, 2020 at 06:24:14AM +0200, Miklos Szeredi wrote:
> On Thu, Apr 23, 2020 at 2:48 AM Josh Triplett <josh@xxxxxxxxxxxxxxxx> wrote:
> > On Wed, Apr 22, 2020 at 09:55:56AM +0200, Miklos Szeredi wrote:
> > > On Wed, Apr 22, 2020 at 8:06 AM Michael Kerrisk (man-pages)
> > > <mtk.manpages@xxxxxxxxx> wrote:
> > > >
> > > > [CC += linux-api]
> > > >
> > > > On Wed, 22 Apr 2020 at 07:20, Josh Triplett <josh@xxxxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > Inspired by the X protocol's handling of XIDs, allow userspace to select
> > > > > the file descriptor opened by openat2, so that it can use the resulting
> > > > > file descriptor in subsequent system calls without waiting for the
> > > > > response to openat2.
> > > > >
> > > > > In io_uring, this allows sequences like openat2/read/close without
> > > > > waiting for the openat2 to complete. Multiple such sequences can
> > > > > overlap, as long as each uses a distinct file descriptor.
> > >
> > > If this is primarily an io_uring feature, then why burden the normal
> > > openat2 API with this?
> >
> > This feature was inspired by io_uring; it isn't exclusively of value
> > with io_uring. (And io_uring doesn't normally change the semantics of
> > syscalls.)
>
> What's the use case of O_SPECIFIC_FD beyond io_uring?

Avoiding a call to dup2 and close, if you need something as a specific
file descriptor, such as when setting up to exec something, or when
debugging a program.
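
For illustration, here is a rough sketch of the conventional sequence
that O_SPECIFIC_FD would make unnecessary: open the file wherever the
kernel places it, move it to the wanted slot with dup2, then close the
original. The helper name open_as_fd is hypothetical and not part of
this patch series; it uses only existing syscalls.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path` and make it available as exactly
 * `target_fd`. Returns target_fd on success, -1 on error (errno set).
 * Note that dup2 silently replaces whatever was open at target_fd.
 */
static int open_as_fd(const char *path, int flags, int target_fd)
{
    int fd = open(path, flags);
    if (fd < 0)
        return -1;

    if (fd == target_fd)
        return fd;              /* kernel happened to pick the right slot */

    if (dup2(fd, target_fd) < 0) {
        int saved = errno;      /* keep dup2's errno across close() */
        close(fd);
        errno = saved;
        return -1;
    }

    close(fd);                  /* drop the temporary descriptor */
    return target_fd;
}
```

A caller preparing to exec a child might use open_as_fd("/dev/null",
O_RDONLY, 0) to set up stdin. That costs up to three syscalls where a
single open at a chosen descriptor would do.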

I don't expect it to be as widely used as with io_uring, but I also
don't want io_uring versions of syscalls to diverge from the underlying
syscalls, and this would be a heavy divergence.

> > > This would also allow implementing a private fd table for io_uring.
> > > I.e. add a flag interpreted by file ops (IORING_PRIVATE_FD), including
> > > openat2 and freely use the private fd space without having to worry
> > > about interactions with other parts of the system.
> >
> > I definitely don't want to add a special kind of file descriptor that
> > doesn't work in normal syscalls taking file descriptors. A file
> > descriptor allocated via O_SPECIFIC_FD is an entirely normal file
> > descriptor, and works anywhere a file descriptor normally works.
>
> What's the use case of allocating a file descriptor within io_uring
> and using it outside of io_uring?

Calling a syscall not provided via io_uring. Calling a library that
doesn't use io_uring. Passing the file descriptor via UNIX socket to
another program. Passing the file descriptor via exec to another
program. Userspace is modular, and file descriptors are widely used.