Re: [Xen-devel] xen/evtchn and forced threaded irq

From: Andrew Cooper
Date: Tue Feb 26 2019 - 04:30:14 EST


On 26/02/2019 09:14, Roger Pau Monné wrote:
> On Mon, Feb 25, 2019 at 01:55:42PM +0000, Julien Grall wrote:
>> Hi Oleksandr,
>>
>> On 25/02/2019 13:24, Oleksandr Andrushchenko wrote:
>>> On 2/22/19 3:33 PM, Julien Grall wrote:
>>>> Hi,
>>>>
>>>> On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:
>>>>> On 2/20/19 10:46 PM, Julien Grall wrote:
>>>>>> Discussing with my team, a solution that came up would be to
>>>>>> introduce one atomic field per event to record the number of
>>>>>> events received. I will explore that solution tomorrow.
>>>>> How will this help if events have some payload?
>>>> What payload? The event channel does not carry any payload. It only
>>>> notifies you that something happened. Then it is up to the user to
>>>> decide what to do with it.
>>> Sorry, I was probably not precise enough. I mean that an event might have
>>> associated payload in the ring buffer, for example [1]. So, counting events
>>> may help somehow, but the ring's data may still be lost
>> From my understanding, event channels are edge interrupts. By definition,
> IMO event channels are active high level interrupts.
>
> Let's take into account the following situation: you have an event
> channel masked and the event channel pending bit (akin to the line on
> bare metal) goes from low to high (0 -> 1), then you unmask the
> interrupt and you get an event injected. If it was an edge interrupt
> you won't get an event injected after unmasking, because you would
> have lost the edge. I think the problem here is that Linux treats
> event channels as edge interrupts, when they are actually level.
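
The masked-then-unmasked scenario quoted above can be sketched as a toy
model. This is only an illustration of the bare-metal analogy, not Xen or
Linux code: the class `ToyLine`, its methods and the helper
`masked_then_unmasked` are all hypothetical names, and "edge" here means a
strict model in which an edge that arrives while masked is simply lost.

```python
class ToyLine:
    """Toy interrupt source with a pending bit and a mask bit.

    Hypothetical model for illustration only; it does not reflect any
    real Xen or Linux data structure.
    """

    def __init__(self, mode):
        if mode not in ("edge", "level"):
            raise ValueError("mode must be 'edge' or 'level'")
        self.mode = mode
        self.pending = False   # akin to the line state on bare metal
        self.masked = False
        self.delivered = 0     # number of events injected into the guest

    def mask(self):
        self.masked = True

    def assert_line(self):
        """Pending goes 0 -> 1; deliver right away only if unmasked."""
        rising = not self.pending
        self.pending = True
        if not self.masked and rising:
            self.delivered += 1

    def unmask(self):
        """Level semantics re-check the line on unmask; strict edge does not."""
        self.masked = False
        if self.mode == "level" and self.pending:
            self.delivered += 1


def masked_then_unmasked(mode):
    """Mask, raise pending 0 -> 1, unmask; return the events delivered."""
    line = ToyLine(mode)
    line.mask()
    line.assert_line()
    line.unmask()
    return line.delivered
```

Under these assumptions, `masked_then_unmasked("level")` delivers one event
on unmask, while `masked_then_unmasked("edge")` delivers none because the
edge happened while the source was masked.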

Event channels are edge interrupts. There are several very subtle bugs
to be had by software which treats them as line interrupts.

Most critically, if you fail to ack them, rebind them to a new vcpu, and
reenable interrupts, you don't get a new interrupt notification. This
was the source of a 4-month bug when XenServer was moving from
classic-xen to PVOps where using irqbalance would cause dom0 to
occasionally lose interrupts.
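
As a rough illustration of the failure mode described above, here is a toy
model. It assumes that a notification is raised only when the pending bit
goes from 0 to 1, and that a rebind does not re-raise a notification for an
event that is already pending. The names `ToyEdgeChannel` and
`upcalls_on_new_vcpu` are hypothetical; this is a sketch of the semantics,
not the actual hypervisor or kernel logic.

```python
class ToyEdgeChannel:
    """Toy edge-style event source: notify only on a 0 -> 1 pending edge.

    Hypothetical model for illustration; real event channel handling
    involves more state than this.
    """

    def __init__(self):
        self.pending = False
        self.bound_vcpu = 0
        self.upcalls = {0: 0, 1: 0}   # notifications seen per vcpu

    def send(self):
        """Raise the event; return True only if a notification was sent."""
        if self.pending:
            return False              # already pending: no new edge
        self.pending = True
        self.upcalls[self.bound_vcpu] += 1
        return True

    def ack(self):
        """Clear the pending bit so the next send produces a new edge."""
        self.pending = False

    def rebind(self, vcpu):
        """Move the channel to another vcpu (assumed not to re-notify)."""
        self.bound_vcpu = vcpu


def upcalls_on_new_vcpu(ack_before_rebind):
    """Send, optionally ack, rebind to vcpu 1, send again."""
    ch = ToyEdgeChannel()
    ch.send()                         # first event lands on vcpu 0
    if ack_before_rebind:
        ch.ack()
    ch.rebind(1)
    ch.send()                         # later event after the rebind
    return ch.upcalls[1]
```

In this model, `upcalls_on_new_vcpu(False)` returns 0: the un-acked event
leaves the pending bit set, so the later send creates no edge and the new
vcpu is never notified. With `upcalls_on_new_vcpu(True)` the new vcpu sees
one notification.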

~Andrew