Re: [KVM PATCH v3 2/2] kvm: add support for irqfd viaeventfd-notification interface

From: Al Viro
Date: Sun May 03 2009 - 15:02:18 EST


On Sun, May 03, 2009 at 11:07:26AM -0700, Davide Libenzi wrote:
> On Sun, 3 May 2009, Al Viro wrote:
> > On Mon, Apr 27, 2009 at 02:33:34PM -0400, Gregory Haskins wrote:
> > > + /* We re-use eventfd for irqfd */
> > > + fd = sys_eventfd2(0, 0);
> > > + if (fd < 0) {
> > > + ret = fd;
> > > + goto fail;
> > > + }
> > > +
> > > + /* We maintain a reference to eventfd for the irqfd lifetime */
> > > + file = eventfd_fget(fd);
> > > + if (IS_ERR(file)) {
> > > + ret = PTR_ERR(file);
> > > + goto fail;
> > > + }
> > > +
> > > + irqfd->file = file;
> >
> > This is just plain wrong. You have no promise whatsoever that caller of
> > that sucker won't race with e.g. dup2(). IOW, you can't assume that
> > file will be of the expected kind.
>
> The eventfd_fget() checks for the file_operations pointer, before
> returning the file*, and fails if the fd in not an eventfd. Or you have
> other concerns?

OK, but... it's still wrong. Descriptor numbers are purely for interaction
with userland; using them that way violates very general race-prevention
rules, even if you do paper over the worst of consequences with check in
eventfd_fget().

General rules:
* descriptor you've generated is fit only for return to userland;
* descriptor you've got from userland is fit only for *single*
fget() or equivalent, unless you are one of the core syscalls manipulating
the descriptor table itself (dup2, etc.)
* once file is installed in descriptor table, you'd better be past
the last failure exit; sys_close() on cleanup path is not acceptable.
That's what reserving descriptors is for.

IOW, the sane solution would be to export something that returns your
struct file *. And stop playing with fd completely.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/