Re: [PATCH v3b] eventfd: convert to f_op->read_iter()

From: Al Viro
Date: Fri May 01 2020 - 15:00:23 EST


On Fri, May 01, 2020 at 11:54:01AM -0600, Jens Axboe wrote:

> @@ -427,8 +424,17 @@ static int do_eventfd(unsigned int count, int flags)
>
>  	fd = anon_inode_getfd("[eventfd]", &eventfd_fops, ctx,
>  			      O_RDWR | (flags & EFD_SHARED_FCNTL_FLAGS));
> -	if (fd < 0)
> +	if (fd < 0) {
>  		eventfd_free_ctx(ctx);
> +	} else {
> +		struct file *file;
> +
> +		file = fget(fd);
> +		if (file) {
> +			file->f_mode |= FMODE_NOWAIT;
> +			fput(file);
> +		}

No. The one and only thing you can do with the return value of anon_inode_getfd()
is to return the fscker to userland. You *CAN* *NOT* assume that the descriptor
table is still pointing to whatever you've just created.

As soon as it's in the descriptor table, it's out of your hands. And frankly, if
you are playing with descriptors, you should be very well aware of that.

Descriptor tables are fundamentally shared objects; they *can* be accessed and
modified by other threads, right behind your back.
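
For illustration only, here is a userland sketch of the same hazard; it is not
kernel code and not part of the patch. The helper names clobber_fd() and
descriptor_replaced_behind_our_back() are made up for this example. One thread
creates an eventfd, a second thread dup2()s a pipe end over the same descriptor
number, and fstat() then shows that the number refers to a different object.

```c
#include <pthread.h>
#include <sys/eventfd.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: plays the part of "another thread" sharing the
 * descriptor table. It replaces whatever sits at descriptor number *arg
 * with the read end of a fresh pipe.
 */
static void *clobber_fd(void *arg)
{
	int fd = *(int *)arg;
	int p[2];

	if (pipe(p) == 0) {
		dup2(p[0], fd);		/* silently replaces the eventfd */
		close(p[0]);
		close(p[1]);
	}
	return NULL;
}

/*
 * Returns 1 if the descriptor number ended up referring to a different
 * object after the other thread ran, 0 if it did not, -1 on setup error.
 */
int descriptor_replaced_behind_our_back(void)
{
	struct stat before, after;
	pthread_t t;
	int ret = -1;
	int fd = eventfd(0, 0);

	if (fd < 0)
		return -1;
	if (fstat(fd, &before) < 0)
		goto out;
	if (pthread_create(&t, NULL, clobber_fd, &fd) != 0)
		goto out;
	pthread_join(t, NULL);
	if (fstat(fd, &after) < 0)
		goto out;

	/* Same number, but is it still the same underlying object? */
	ret = before.st_dev != after.st_dev || before.st_ino != after.st_ino;
out:
	close(fd);
	return ret;
}
```

The join makes the ordering deterministic here; in the kernel case there is no
such ordering, which is why fget(fd) right after installation may hand back
somebody else's file.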

*IF* you are going to play with ->f_mode, you must use get_unused_fd_flags()
to reserve a descriptor, anon_inode_getfile() to create the file, modify
->f_mode of the result and only then use fd_install() to put it into the
descriptor table, with put_unused_fd() as cleanup in case of allocation
failure.
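
A minimal sketch of that sequence, for illustration only. The function name
example_anon_fd_nowait() and its parameters are hypothetical; in do_eventfd()
the caller would still have to free ctx on a negative return. Untested.

```c
#include <linux/anon_inodes.h>
#include <linux/err.h>
#include <linux/file.h>
#include <linux/fs.h>

/* Hypothetical helper, not an existing kernel function. */
static int example_anon_fd_nowait(const char *name,
				  const struct file_operations *fops,
				  void *priv, int flags)
{
	struct file *file;
	int fd;

	/* Reserve a descriptor number; nothing is visible to userland yet. */
	fd = get_unused_fd_flags(flags);
	if (fd < 0)
		return fd;

	file = anon_inode_getfile(name, fops, priv, flags);
	if (IS_ERR(file)) {
		/* Give the reserved number back. */
		put_unused_fd(fd);
		return PTR_ERR(file);
	}

	/* Still private to us, so this is safe. */
	file->f_mode |= FMODE_NOWAIT;

	/* Publish; after this point the file is out of our hands. */
	fd_install(fd, file);
	return fd;
}
```

All modifications happen before fd_install(), so no other thread can reach the
file through the descriptor table while ->f_mode is being changed.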