Re: [PATCH RFC] kvm: enable irq injection from interrupt context

From: Gleb Natapov
Date: Thu Sep 16 2010 - 08:33:14 EST


On Thu, Sep 16, 2010 at 02:13:38PM +0200, Michael S. Tsirkin wrote:
> > > We have two users: qemu does deasserts, vhost-net does asserts.
> > Well this is broken. You want KVM to track level for you and this is
> > wrong. KVM does this anyway because it can't rely on the device model
> > to behave correctly [0], but in your case it is designed to behave
> > incorrectly.
> >
> > Interrupt type is a device property. PCI devices just happen to be level
> > triggered according to PCI spec. What if you want to use vhost-net to
> > implement network device which has active-low interrupt line? [1]
>
> The polarity would have to be reversed in gsi (irq line can be shared,
> all devices must be active high or low consistently).
>
There are gsi dedicated to PCI. They can be shared only between PCI
devices.

> > If you want to split the parts that assert the irq and de-assert it,
> > then we should have an irqfd that tracks line status and knows the
> > interrupt line polarity.
>
> Yes, it can know about polarity even though I think it's cleaner to do this
> per gsi. But it can not track line status as line is shared with
> other devices.
It should track only the device's own line status.
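
As a rough illustration of per-device tracking on a shared line, here is a
minimal userspace sketch. It is not KVM code: struct dev_line, struct
shared_gsi and dev_set_pin() are hypothetical names, and per-device polarity
is an assumption made for the example (the quoted reply prefers polarity per
gsi). Each device records only its own logical state, and the shared line is
the OR of those states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types for illustration only; this is not the KVM API. */
struct dev_line {
    bool active_low;  /* polarity of this device's interrupt pin */
    bool asserted;    /* logical state: is the device requesting service? */
};

struct shared_gsi {
    uint32_t asserted_mask;  /* bit i is set while device i asserts */
    unsigned updates;        /* level changes that would reach the irqchip */
};

/* The shared line is asserted while at least one device asserts it. */
static bool gsi_level(const struct shared_gsi *g)
{
    return g->asserted_mask != 0;
}

/*
 * Record the electrical state of one device's pin.
 * Returns true only when the level of the shared line changed, i.e. when
 * a call into the interrupt controller would actually be needed.
 */
static bool dev_set_pin(struct shared_gsi *g, struct dev_line *d,
                        unsigned id, bool pin_high)
{
    bool asserted = d->active_low ? !pin_high : pin_high;
    bool old_level = gsi_level(g);

    assert(id < 32);

    if (asserted == d->asserted)
        return false;  /* this device's own state did not change */

    d->asserted = asserted;
    if (asserted)
        g->asserted_mask |= 1u << id;
    else
        g->asserted_mask &= ~(1u << id);

    if (gsi_level(g) == old_level)
        return false;  /* another device still holds the line */

    g->updates++;
    return true;
}
```

In this sketch a redundant assert from one device, or a deassert while
another device still holds the line, returns false and costs no call into
the interrupt controller.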

>
> > > Another application is out of process virtio (sandboxing!).
> > It will still assert and de-assert the irq in the same code, so it
> > will be able to track irq line status.
> >
> > > Again, pci stuff needs to stay in qemu.
> > >
> >
> > Nothing to do with PCI whatsoever.
> >
> > [0] most qemu devices behave incorrectly and trigger level irq more than
> > needed.
>
> Which devices?
Most of them. They just call update_irq_status() or something and
re-assert the interrupt regardless of what the previous status was.

> pci core tracks line status and will never assert the same
> line multiple times.
That's good if pci core does this, but device shouldn't even try it.
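
To make the difference concrete, here is a small sketch of the two patterns
discussed above: a device model that pushes its level on every status
update, and one that remembers the last level it pushed. All names
(struct dev, irq_set_level(), update_irq_status_naive(),
update_irq_status_tracked()) are hypothetical and are not taken from qemu.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device model; names are illustrative, not qemu's. */
struct dev {
    bool irq_pending;  /* device-internal condition that drives the line */
    bool last_level;   /* level most recently pushed to the irq layer */
    unsigned calls;    /* number of calls made into the irq layer */
};

/* Stand-in for the call into the interrupt controller; it only counts. */
static void irq_set_level(struct dev *d, bool level)
{
    assert(d);
    (void)level;
    d->calls++;
}

/* Naive pattern: push the level on every status update. */
static void update_irq_status_naive(struct dev *d)
{
    irq_set_level(d, d->irq_pending);
}

/* Tracking pattern: push the level only when it actually changed. */
static void update_irq_status_tracked(struct dev *d)
{
    if (d->irq_pending == d->last_level)
        return;
    d->last_level = d->irq_pending;
    irq_set_level(d, d->last_level);
}
```

With irq_pending held true across several updates, the naive variant calls
into the irq layer once per update, while the tracked variant calls once
and then stays quiet until the level changes.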

>
> > [1] this is how a correct PCI device should behave, but we override
> > polarity in ACPI; now the incorrect behaviour is deeply designed
> > into vhost-net.
>
> Not really, vhost net signals an eventfd. What happens then is
> up to kvm.
>
That is what the current broken design does, and it works, but if you want
to save unneeded calls into kvm, fix the design.

--
Gleb.