Re: xen/evtchn and forced threaded irq

From: Boris Ostrovsky
Date: Wed Feb 20 2019 - 16:47:54 EST


On 2/20/19 3:46 PM, Julien Grall wrote:
> (+ Andrew and Jan for feedback on the event channel interrupt)
>
> Hi Boris,
>
> Thank you for the your feedback.
>
> On 2/20/19 8:04 PM, Boris Ostrovsky wrote:
>> On 2/20/19 1:05 PM, Julien Grall wrote:
>>> Hi,
>>>
>>> On 20/02/2019 17:07, Boris Ostrovsky wrote:
>>>> On 2/20/19 9:15 AM, Julien Grall wrote:
>>>>> Hi Boris,
>>>>>
>>>>> Thank you for your answer.
>>>>>
>>>>> On 20/02/2019 00:02, Boris Ostrovsky wrote:
>>>>>> On Tue, Feb 19, 2019 at 05:31:10PM +0000, Julien Grall wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I have been looking at using Linux RT in Dom0. Once the guest is
>>>>>>> started,
>>>>>>> the console is ending to have a lot of warning (see trace below).
>>>>>>>
>>>>>>> After some investigation, this is because the irq handler will now
>>>>>>> be threaded.
>>>>>>> I can reproduce the same error with the vanilla Linux when passing
>>>>>>> the option
>>>>>>> 'threadirqs' on the command line (the trace below is from 5.0.0-rc7
>>>>>>> that has
>>>>>>> not RT support).
>>>>>>>
>>>>>>> FWIW, the interrupt for port 6 is used to for the guest to
>>>>>>> communicate with
>>>>>>> xenstore.
>>>>>>>
>>>>>>> ÂÂÂFrom my understanding, this is happening because the interrupt
>>>>>>> handler is now
>>>>>>> run in a thread. So we can have the following happening.
>>>>>>>
>>>>>>> ÂÂÂÂÂ Interrupt contextÂÂÂÂÂÂÂÂÂÂÂ |ÂÂÂÂ Interrupt thread
>>>>>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |
>>>>>>> ÂÂÂÂÂ receive interrupt port 6ÂÂÂÂ |
>>>>>>> ÂÂÂÂÂ clear the evtchn portÂÂÂÂÂÂÂ |
>>>>>>> ÂÂÂÂÂ set IRQF_RUNTHREADÂÂÂÂÂÂÂÂÂÂÂ |
>>>>>>> ÂÂÂÂÂ kick interrupt threadÂÂÂÂÂÂÂ |
>>>>>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |ÂÂÂ clear IRQF_RUNTHREAD
>>>>>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |ÂÂÂ call evtchn_interrupt
>>>>>>> ÂÂÂÂÂ receive interrupt port 6ÂÂÂÂ |
>>>>>>> ÂÂÂÂÂ clear the evtchn portÂÂÂÂÂÂÂ |
>>>>>>> ÂÂÂÂÂ set IRQF_RUNTHREADÂÂÂÂÂÂÂÂÂÂ |
>>>>>>> ÂÂÂÂÂ kick interrupt threadÂÂÂÂÂÂÂ |
>>>>>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |ÂÂÂ disable interrupt port 6
>>>>>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |ÂÂÂ evtchn->enabled = false
>>>>>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |ÂÂÂ [....]
>>>>>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |
>>>>>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |ÂÂÂ *** Handling the second
>>>>>>> interrupt ***
>>>>>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |ÂÂÂ clear IRQF_RUNTHREAD
>>>>>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |ÂÂÂ call evtchn_interrupt
>>>>>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |ÂÂÂ WARN(...)
>>>>>>>
>>>>>>> I am not entirely sure how to fix this. I have two solutions in
>>>>>>> mind:
>>>>>>>
>>>>>>> 1) Prevent the interrupt handler to be threaded. We would also
>>>>>>> need to
>>>>>>> switch from spin_lock to raw_spin_lock as the former may sleep on
>>>>>>> RT-Linux.
>>>>>>>
>>>>>>> 2) Remove the warning
>>>>>>
>>>>>> I think access to evtchn->enabled is racy so (with or without the
>>>>>> warning) we can't use it reliably.
>>>>>
>>>>> Thinking about it, it would not be the only issue. The ring is sized
>>>>> to contain only one instance of the same event. So if you receive
>>>>> twice the event, you may overflow the ring.
>>>>
>>>> Hm... That's another argument in favor of "unthreading" the handler.
>>>
>>> I first thought it would be possible to unthread it. However,
>>> wake_up_interruptible is using a spin_lock. On RT spin_lock can sleep,
>>> so this cannot be used in an interrupt context.
>>>
>>> So I think "unthreading" the handler is not an option here.
>>
>> That sounds like a different problem. I.e. there are two issues:
>> * threaded interrupts don't work properly (races, ring overflow)
>> * evtchn_interrupt() (threaded or not) has spin_lock(), which is not
>> going to work for RT
>
> I am afraid that's not correct, you can use spin_lock() in threaded
> interrupt handler.

In non-RT handler -- yes, but not in an RT one (in fact, isn't this what
you yourself said above?)

>
>> The first can be fixed by using non-threaded handlers.
>
> The two are somewhat related, if you use a non-threaded handler then
> you are not going to help the RT case.
>
> In general, the unthreaded solution should be used in the last resort.
>
>>>
>>>>
>>>>>
>>>>>>
>>>>>> Another alternative could be to queue the irq if !evtchn->enabled
>>>>>> and
>>>>>> handle it in evtchn_write() (which is where irq is supposed to be
>>>>>> re-enabled).
>>>>> What do you mean by queue? Is it queueing in the ring?
>>>>
>>>>
>>>> No, I was thinking about having a new structure for deferred
>>>> interrupts.
>>>
>>> Hmmm, I am not entirely sure what would be the structure here. Could
>>> you expand your thinking?
>>
>> Some sort of a FIFO that stores {irq, data} tuple. It could obviously be
>> implemented as a ring but not necessarily as Xen shared ring (if that's
>> what you were referring to).
>
> The underlying question is what happen if you miss an interrupt. Is it
> going to be ok?

This I am not sure about. I thought yes since we are signaling the
process only once.

-boris

> If no, then we have to record everything and can't kill/send an error
> to the user app because it is not its fault.
>
> This means a FIFO would not be a viable. How do you size it? Static
> (i.e pre-defined) size is not going to be possible because you don't
> know how many interrupt you are going to receive before the thread
> handler runs. You can't make it grow dynamically as it make become
> quite big for the same reason.
>
> Discussing with my team, a solution that came up would be to introduce
> one atomic field per event to record the number of event received. I
> will explore that solution tomorrow.
>