Re: Userfaultfd doesn't seem to break out of poll on fd close
From: Peter Xu
Date: Wed Apr 15 2020 - 21:38:11 EST
On Wed, Apr 15, 2020 at 06:15:26PM -0700, Brian Geffon wrote:
> Hi Andrea,
> Thanks for taking the time to reply.
>
> > static int userfaultfd_flush(struct file *file, fl_owner_t id)
> > {
> > 	struct userfaultfd_ctx *ctx = file->private_data;
> > 	wake_up_poll(&ctx->fd_wqh, EPOLLHUP);
> > 	return 0;
> > }
> >
>
> Yes, I think that something like this would work for this situation and eventfd.
>
> > If eventfd and pipes all behave identical to uffd (they should as they
> > don't seem to implement flush) I'm not sure if there's good enough
> > justification to deviate from the default VFS behavior here.
>
> Pipes actually behave a little differently, in the case that you close
> the write end of the pipe the read end will break out of the poll with
> EPOLLHUP, but I suppose closing the read end while the read end is
> being polled would be more analogous to what I'm describing here. And
> this is why it felt weird to me, in these situations the kernel
> _knows_ that after the close nothing can happen on the file
> descriptor, so what's the point of keeping it in a poll? As soon as
> the poll breaks any read, write, ioctl, etc on the fd whether it's a
> userfaultfd or an eventfd would fail with -EBADF.
>
> And all of that I guess makes sense in the case of a non-blocking fd,
> but what about the case of a blocking file descriptor? Both
> userfaultfd and eventfd can seemingly be stuck in a read syscall with
> no way to break them out when the userfaultfd/eventfd has no further
> utility. Here is an example:
> https://gist.github.com/bgaff/607302d86d99ac539efca307ce2dd679
>
> For my use case adding an eventfd on poll works well, so thank you for
> that suggestion. But the behavior just seemed odd to me which is why I
> started this thread.
Hi, Brian,
I think I can understand why this looks weird when compared to pipes.
IIUC that is mainly what POLLHUP is used for: it tells us that the
other end of the channel has closed. I believe the same applies to a
pair of connected sockets: when one end closes, the other side gets a
POLLHUP.
However, IMO userfaultfd is not a channel like a pipe, as you have
already mentioned; it has no paired ends. As for your other example,
closing the read end of a pipe while polling that same read end, I'm
curious what happens there. I suspect it behaves the same way, i.e.
it stays blocked, just like userfaultfd and eventfd do. My
understanding is that the Linux kernel should be thread safe for all
of these operations, so the kernel shouldn't break no matter which
syscalls we use or in what order. However, IMHO that does not mean it
guarantees things like "close() will kick all existing fd operations".
I don't know whether POSIX or anything else places any requirement on
this, but... I wouldn't be too surprised if someone told me there's
some OS that simply crashes the process if an fd is close()ed during
a read()...
Thanks,
--
Peter Xu