Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifierrace conditions

From: Gregory Haskins
Date: Mon Jun 22 2009 - 21:04:08 EST

Davide Libenzi wrote:
> On Mon, 22 Jun 2009, Gregory Haskins wrote:
>> So up in userspace, the vbus pci-device would have an open reference to
>> the kvm guest (derived from /dev/kvm) and an open reference to a vbus
>> (derived from /dev/vbus). Lets call these kvmfd, and vbusfd,
>> respectively. For something like an interrupt, we would hook the point
>> where the PCI-MSI interrupt is assigned, and would do the following:
>> gsi = kvm_irq_route_gsi();
>> fd = eventfd(0, 0);
>> ioctl(kvmfd, KVM_IRQFD_ASSIGN, {fd, gsi});
>> ioctl(vbusfd, VBUS_SHMSIGNAL_ASSIGN, {sigid, fd});
>> So userspace orchestrated the assignment of this one eventfd to a KVM
>> consumer, and a VBUS producer. The two subsystems do not care about the
>> details of the other side of the link, per se. VBUS just knows that it
>> can eventfd_signal() its memory region to tell whomever is listening
>> that it changed. Likewise, KVM just knows to inject "gsi" when it gets
>> signalled. You could equally have given "fd" to a userspace thread for
>> either producer or consumer roles, or any other combination.
>> If we were doing PCI-passthough, substitute the last SHMSIGNAL_ASSIGN
>> ioctl call with some PCI_PASSTHROUGH_ASSIGN verb and you get the idea.
>> The important thing is that once this is established, userspace doesn't
>> necessarily care about the fd anymore. So now the question is: do we
>> keep it around for other things? Do we keep it around because we don't
>> want KVM to see the POLLHUP, or do we address the "release" code so that
>> it works even if userspace issued close(fd) at this point. I am not
>> sure what the answer is, but this is the scenario we are concerned with
>> in this thread. In the example above, vbus is free to produce events on
>> its eventfd until it gets a SHMSIGNAL_DEASSIGN request.
> I see.
> The thing remains, that in order to reliably handle generic
> producer/consumer scenarios you'd need a reference counting similar to
> pipes, where the notion of producer and consumer is very well defined.

I see your point.

Well, I think the more important thing here is that we address the
races, and add support for DEASSIGN. We can do both of those things
with any of the patches that you and I have been kicking around. So
what I propose is that we move forward with whatever patch you bless as
proper for now. This producer-release issue is pretty minor in the
grand scheme of things. We can always just have userspace hold the fd.

I can either take in the last one you sent, or it sounds like you wanted
to possibly do another round of cleanup? Whatever the case may be, let
me know and we can coordinate with Andrew/Avi on what tree to pull it
into. It sounds like riding in kvm.git is the perhaps the most logical.

Thanks Davide,

Attachment: signature.asc
Description: OpenPGP digital signature