Re: [Xen-devel] xen/evtchn and forced threaded irq

From: Julien Grall
Date: Wed Feb 27 2019 - 06:09:42 EST


Hi,

On 2/26/19 11:02 AM, Roger Pau Monné wrote:
On Tue, Feb 26, 2019 at 10:26:21AM +0000, Julien Grall wrote:
On 26/02/2019 10:17, Roger Pau Monné wrote:
On Tue, Feb 26, 2019 at 10:03:38AM +0000, Julien Grall wrote:
Hi Roger,

On 26/02/2019 09:44, Roger Pau Monné wrote:
On Tue, Feb 26, 2019 at 09:30:07AM +0000, Andrew Cooper wrote:
On 26/02/2019 09:14, Roger Pau Monné wrote:
On Mon, Feb 25, 2019 at 01:55:42PM +0000, Julien Grall wrote:
Hi Oleksandr,

On 25/02/2019 13:24, Oleksandr Andrushchenko wrote:
On 2/22/19 3:33 PM, Julien Grall wrote:
Hi,

On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:
On 2/20/19 10:46 PM, Julien Grall wrote:
Discussing with my team, a solution that came up would be to
introduce one atomic field per event to record the number of
events received. I will explore that solution tomorrow.
How will this help if events have some payload?
What payload? The event channel does not carry any payload. It only
notifies you that something happened. It is then up to the user to
decide what to do with it.
Sorry, I was probably not precise enough. I mean that an event might have
associated payload in the ring buffer, for example [1]. So, counting events
may help somehow, but the ring's data may still be lost
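
To make the counter idea above concrete, here is a minimal sketch in C11 of one atomic field per event. It is an illustration only, not the actual evtchn driver code: the struct and function names (evt_counter, evt_record, evt_consume_all) are hypothetical, and userspace <stdatomic.h> stands in for the kernel's atomic_t helpers. It records only how many notifications arrived. Any data in a shared ring still has to be read from the ring itself.

```c
#include <stdatomic.h>

/* Hypothetical per-event state; not the real evtchn driver layout. */
struct evt_counter {
    atomic_uint pending; /* notifications received but not yet consumed */
};

/* Interrupt path: record that one more notification arrived. */
static void evt_record(struct evt_counter *c)
{
    atomic_fetch_add_explicit(&c->pending, 1, memory_order_release);
}

/*
 * Threaded handler / reader path: atomically take every notification
 * recorded so far and reset the counter to zero. A notification that
 * arrives after the exchange is kept for the next call, so none is lost.
 */
static unsigned int evt_consume_all(struct evt_counter *c)
{
    return atomic_exchange_explicit(&c->pending, 0, memory_order_acquire);
}
```

The exchange-to-zero avoids a separate read and clear, which could drop a notification that lands between the two steps.
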
From my understanding, event channels are edge interrupts. By definition,
IMO event channels are active high level interrupts.

Let's take into account the following situation: you have an event
channel masked and the event channel pending bit (akin to the line on
bare metal) goes from low to high (0 -> 1), then you unmask the
interrupt and you get an event injected. If it was an edge interrupt
you won't get an event injected after unmasking, because you would
have lost the edge. I think the problem here is that Linux treats
event channels as edge interrupts, when they are actually level.
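
The masked/pending scenario described above can be written down as a toy state machine. This is a simplified model under stated assumptions, not Xen code: the names (evtchn_model, model_send, model_unmask) are made up for illustration, and it assumes an upcall is injected either when the pending bit goes 0 -> 1 while unmasked, or when a port that is already pending gets unmasked.

```c
#include <stdbool.h>

/* Toy model of a single event channel port (hypothetical, not Xen's layout). */
struct evtchn_model {
    bool pending;          /* akin to the interrupt line on bare metal */
    bool masked;
    unsigned int upcalls;  /* how many times the guest was notified */
};

/* Sender side: set the pending bit; notify only on a 0 -> 1 change while unmasked. */
static void model_send(struct evtchn_model *e)
{
    bool was_pending = e->pending;

    e->pending = true;
    if (!was_pending && !e->masked)
        e->upcalls++;
}

/* Guest side: unmasking a port that is still pending injects an event. */
static void model_unmask(struct evtchn_model *e)
{
    e->masked = false;
    if (e->pending)
        e->upcalls++;
}
```

In this model, an event sent while the port is masked is not delivered immediately but is delivered on unmask, which is the behaviour the example above relies on. A pure edge-triggered line would have lost it.
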

Event channels are edge interrupts. There are several very subtle bugs
to be had by software which treats them as line interrupts.

Most critically, if you fail to ack them, rebind them to a new vcpu, and
reenable interrupts, you don't get a new interrupt notification. This
was the source of a 4 month bug when XenServer was moving from
classic-xen to PVOps where using irqbalance would cause dom0 to
occasionally lose interrupts.

I would argue that you need to mask them first, rebind to a new vcpu
and unmask, and then you will get an interrupt notification, or this
should be fixed in Xen to work as you expect: trigger an interrupt
notification when moving an asserted event channel between CPUs.

Is there any document that describes how such non trivial things (like
moving between CPUs) work for event/level interrupts?

Maybe I'm being obtuse, but from the example I gave above it's quite
clear to me event channels don't get triggered based on edge changes,
but rather on the line level.

Your example above is not enough to give the semantics of level. You would
only use the MASK bit if your interrupt handler is threaded to avoid the
interrupt coming up again.

So if you remove the mask from the equation, then the interrupt flow should be:

1) handle interrupt
2) EOI

This is bogus if you don't mask the interrupt source. You should
instead do

1) EOI
2) Handle interrupt

And loop over this.
So that's not level semantics. It is an edge one :). In the level case, you
would clear the state once you are done with the interrupt.

Also, it would be ACK and not EOI.
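
For illustration, the "ack first, then handle, and loop" ordering discussed above could look like the sketch below. It is a userspace C11 model, not kernel code: pending_bit, handle_work and edge_style_service are hypothetical names, and the atomic flag stands in for the event channel's pending bit.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool pending_bit;  /* stands in for the port's pending bit */
static unsigned int handled;     /* number of handler passes, for illustration */

/* Placeholder for the real work, e.g. draining a shared ring. */
static void handle_work(void)
{
    handled++;
}

/*
 * Edge-style servicing: clear (ack) the pending bit *before* handling.
 * If a new event arrives while handle_work() runs, the bit is set again
 * and the loop makes another pass, so the event is not lost.
 */
static void edge_style_service(void)
{
    while (atomic_exchange(&pending_bit, false))
        handle_work();
}
```

Clearing the bit only after handle_work() returns would be the level-style ordering. Without masking the source, an event raised during the handler would then be wiped by the late clear.
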

For level triggered interrupts you have to somehow signal the device
to stop asserting the line, which doesn't happen for Xen devices
because they just signal interrupts to Xen, but don't have a way to
keep event channels asserted, so I agree that this is different from
traditional level interrupts because devices using event channels
don't have a way to keep lines asserted.

I guess the most similar native interrupt is MSI with masking
support?

I don't know enough about MSI with masking support to be able to draw a comparison :).

The flow that was suggested to me for reuse in Linux is handle_fasteoi_ack_irq. I haven't yet had time to try it.

Cheers,

--
Julien Grall