Re: [PATCH v3 0/4] open/accept directly into io_uring fixed file table

From: Josh Triplett
Date: Mon Aug 23 2021 - 15:14:01 EST


On Sat, Aug 21, 2021 at 08:18:12PM -0600, Jens Axboe wrote:
> On 8/21/21 9:52 AM, Pavel Begunkov wrote:
> > Add an optional feature to open/accept directly into io_uring's fixed
> > file table, bypassing the normal file table. Same behaviour as the
> > snippet below, but in one operation:
> >
> > sqe = prep_[open,accept](...);
> > cqe = submit_and_wait(sqe);
> > io_uring_register_files_update(uring_idx, (fd = cqe->res));
> > close((fd = cqe->res));
> >
> > The idea is pretty old, and was brought up and implemented a year ago
> > by Josh Triplett, though it never saw the light of day for some reason.
> >
> > The behaviour is controlled by setting sqe->file_index, where 0 implies
> > the old behaviour. If a non-zero value is specified, then it will behave
> > as described and place the file into a fixed file slot
> > sqe->file_index - 1. A file table should be already created, the slot
> > should be valid and empty, otherwise the operation will fail.
> >
> > We can't use IOSQE_FIXED_FILE to switch between modes, because accept
> > takes a file, and it already uses the flag with a different meaning.
> >
> > since RFC:
> > - added attribution
> > - updated descriptions
> > - rebased
> >
> > since v1:
> > - EBADF if slot is already used (Josh Triplett)
> > - alias index with splice_fd_in (Josh Triplett)
> > - fix a bound check bug
>
> With the prep series, this looks good to me now. Josh, what do you
> think?

I would still like to see this using a union with the `nofile` field in
io_open and io_accept, rather than overloading the 16-bit buf_index
field. That would avoid truncating to 16 bits, and leave less work for a
later expansion of fixed file indexes beyond 16 bits.

(I'd also like that to actually use a union, rather than overloading the
meaning of buf_index/nofile.)
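
To illustrate the truncation concern, here is a rough userspace sketch.
The struct and function names are hypothetical stand-ins, not the actual
kernel definitions; the only assumption carried over is that buf_index is
16 bits wide while a union member next to `nofile` could be wider.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in: the slot is squeezed into a 16-bit field. */
struct sketch_open_overloaded {
	uint16_t buf_index;
};

/* Hypothetical stand-in: the slot shares storage with nofile in a union. */
struct sketch_open_union {
	union {
		unsigned long nofile;	/* normal open: fd limit to enforce */
		uint32_t fixed_slot;	/* direct open: fixed file table slot */
	};
};

/* Round-trip a requested slot through the 16-bit field. */
static uint32_t slot_via_buf_index(uint32_t requested)
{
	struct sketch_open_overloaded req = { .buf_index = (uint16_t)requested };

	return req.buf_index;
}

/* Round-trip a requested slot through the 32-bit union member. */
static uint32_t slot_via_union(uint32_t requested)
{
	struct sketch_open_union req = { .fixed_slot = requested };

	/* Sanity check for the sketch: the member holds the full value. */
	assert(req.fixed_slot == requested);
	return req.fixed_slot;
}
```

With the 16-bit field, a requested slot of 70000 silently comes back as
4464 (70000 mod 65536); with the union member it comes back unchanged.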

I personally still feel that using non-zero to signify index-plus-one is
both error-prone and not as future-compatible. I think we could do
better with no additional overhead. But I think the final call on that
interface is up to you, Jens. Do you think it'd be worth spending a flag
bit or using a different opcode, to get a cleaner interface? If you
don't, then I'd be fine with seeing this go in with just the io_open and
io_accept change.
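
For comparison, a minimal sketch of the two encodings under discussion.
Everything here is hypothetical (SKETCH_FLAG_DIRECT, struct sketch_target,
and both decode helpers are made up for illustration); it only models the
index-plus-one rule from the cover letter against a dedicated flag bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_FLAG_DIRECT 0x1u	/* hypothetical flag bit, not a real IOSQE_* flag */

struct sketch_target {
	bool direct;	/* true: install into the fixed file table */
	uint32_t slot;	/* valid only when direct is true */
};

/* Index-plus-one: 0 means a normal fd, n > 0 means fixed slot n - 1. */
static struct sketch_target decode_plus_one(uint32_t file_index)
{
	struct sketch_target t = { .direct = file_index != 0, .slot = 0 };

	if (t.direct)
		t.slot = file_index - 1;
	return t;
}

/* Flag bit: the flag selects the mode, the index is the slot as-is. */
static struct sketch_target decode_flag(uint32_t flags, uint32_t file_index)
{
	struct sketch_target t = {
		.direct = (flags & SKETCH_FLAG_DIRECT) != 0,
		.slot = 0,
	};

	if (t.direct)
		t.slot = file_index;
	return t;
}
```

In the first form a caller who wants slot 0 must remember to pass 1, and
passing a raw slot number lands one slot lower than intended. In the second
form the slot number is passed unmodified, at the cost of a flag bit.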

- Josh Triplett