Re: [KVM PATCH v9 0/5] irqfd fixes and enhancements

From: Gregory Haskins
Date: Mon Jul 06 2009 - 12:42:30 EST


Michael S. Tsirkin wrote:
> On Mon, Jul 06, 2009 at 10:56:02AM -0400, Gregory Haskins wrote:
>
>> Avi Kivity wrote:
>>
>>> On 07/02/2009 06:50 PM, Avi Kivity wrote:
>>>
>>>> On 07/02/2009 06:37 PM, Gregory Haskins wrote:
>>>>
>>>>> (Applies to kvm.git/master:1f9050fd)
>>>>>
>>>>> The following is the latest attempt to fix the races in
>>>>> irqfd/eventfd, as
>>>>> well as restore DEASSIGN support. For more details, please read the
>>>>> patch
>>>>> headers.
>>>>>
>>>>> As always, this series has been tested against the kvm-eventfd unit
>>>>> test
>>>>> and everything appears to be functioning properly. You can download
>>>>> this
>>>>> test here:
>>>>>
>>>> Applied, thanks.
>>>>
>>>>
>>> ... and unapplied. There's a refcounting mismatch in irqfd_cleanup: a
>>> reference is taken for each irqfd, but dropped for each guest. This
>>> causes an oops if a guest with no irqfds is created and destroyed:
>>>
>> I was able to reproduce this issue. The problem turned out to be that I
>> inadvertently always did a flush_workqueue(), even if the work-queue was
>> never initialized.
>>
>> The following interdiff applied to the reverted patch has been confirmed
>> to fix the issue:
>>
>
> Could you document the init boolean and its locking rules?
> The best place to put it would be where the field is declared btw.
>

Will do

> Is it true that init === list_empty(&kvm->irqfds.items)?
> If yes maybe we don't need this field at all.
>
>
No, because its more difficult to maintain the work-queue when
referenced against active irqfds (*). So instead, its maintained
against guests that use irqfd, whether they have an active irqfd or
not. Otherwise you have to contend with the eventfd-side release, which
is a little tricky.

(*) I'm sure its not rocket science to get this working, but it was
getting more complex than I thought it was worth, so I simplified the
model to be per-vm. Note that this design decision/limitation is
declared in the patch header.
>
>> -------------------
>>
>> diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
>> index fcc3469..52b0e04 100644
>> --- a/virt/kvm/eventfd.c
>> +++ b/virt/kvm/eventfd.c
>> @@ -318,6 +318,9 @@ kvm_irqfd_deassign(struct kvm *kvm, int fd, int gsi)
>> struct _irqfd *irqfd, *tmp;
>> struct eventfd_ctx *eventfd;
>>
>> + if (!kvm->irqfds.init)
>> + return -ENOENT;
>> +
>> eventfd = eventfd_ctx_fdget(fd);
>> if (IS_ERR(eventfd))
>> return PTR_ERR(eventfd);
>>
>
> wouldn't it be cleaner to error out in the for each loop if we don't
> find an entry to deactivate? Might be helpful for apps to get an error
> if they didn't deassign anything.
>

Again, irqfds.init is somewhat orthogonal to whether the list is
populated or not. This check is for sanity (how can you deassign if you
didnt assign, etc). Normally this would be a simple BUG_ON() sanity
check, but I don't want a malicious/broken userspace to gain an easy
attack vector ;)

>
>> @@ -360,6 +363,9 @@ kvm_irqfd_release(struct kvm *kvm)
>> {
>> struct _irqfd *irqfd, *tmp;
>>
>> + if (!kvm->irqfds.init)
>> + return;
>> +
>>
>
> So here, I recall some old comment that flush below was
> needed even if list is empty. Is this no longer true?
>

If you are using irqfd, its true. If irqfds.init == false, you are not
using irqfd and thus the flush cannot be needed.

> If not it might be cleaner to only flush if list is not empty.
>
>
You have to flush if irqfds.init == true even if the list is empty
because you need to be sure that eventfd-side releases complete. They
may have already removed themselves from the list, but the work-item is
still in flight.

Regards,
-Greg

Attachment: signature.asc
Description: OpenPGP digital signature