Re: [PATCH v3b] eventfd: convert to f_op->read_iter()

From: Jens Axboe
Date: Fri May 01 2020 - 15:08:24 EST


On 5/1/20 1:00 PM, Al Viro wrote:
> On Fri, May 01, 2020 at 11:54:01AM -0600, Jens Axboe wrote:
>
>> @@ -427,8 +424,17 @@ static int do_eventfd(unsigned int count, int flags)
>>
>> fd = anon_inode_getfd("[eventfd]", &eventfd_fops, ctx,
>> O_RDWR | (flags & EFD_SHARED_FCNTL_FLAGS));
>> - if (fd < 0)
>> + if (fd < 0) {
>> eventfd_free_ctx(ctx);
>> + } else {
>> + struct file *file;
>> +
>> + file = fget(fd);
>> + if (file) {
>> + file->f_mode |= FMODE_NOWAIT;
>> + fput(file);
>> + }
>
> No. The one and only thing you can do to return value of anon_inode_getfd() is to
> return the fscker to userland. You *CAN* *NOT* assume that descriptor table is
> still pointing to whatever you've just created.
>
> As soon as it's in descriptor table, it's out of your hands. And frankly, if you
> are playing with descriptors, you should be very well aware of that.
>
> Descriptor tables are fundamentally shared objects; they *can* be accessed and
> modified by other threads, right behind your back.
>
> *IF* you are going to play with ->f_mode, you must use get_unused_fd_flags(),
> anon_inode_getfile(), modify ->f_mode of the result and use fd_install() to
> put it into descriptor table. With put_unused_fd() as cleanup in case
> of allocation failure.

OK, that makes sense, so we've got f_mode set before the fd_install() and fd
visibility. I wrote that up, will test, and send out a v4... Thanks Al, this
is very helpful.

--
Jens Axboe