Re: [Xen-devel] [PATCH 2/2] xen: make xen_qlock_wait() nestable

From: Juergen Gross
Date: Mon Oct 01 2018 - 05:03:14 EST


On 01/10/2018 10:57, Jan Beulich wrote:
>>>> On 01.10.18 at 09:16, <jgross@xxxxxxxx> wrote:
>> xen_qlock_wait() isn't safe for nested calls due to interrupts. A call
>> of xen_qlock_kick() might be ignored in case a deeper nesting level
>> was active right before the call of xen_poll_irq():
>>
>> CPU 1: CPU 2:
>> spin_lock(lock1)
>> spin_lock(lock1)
>> -> xen_qlock_wait()
>> -> xen_clear_irq_pending()
>> Interrupt happens
>> spin_unlock(lock1)
>> -> xen_qlock_kick(CPU 2)
>> spin_lock_irqsave(lock2)
>> spin_lock_irqsave(lock2)
>> -> xen_qlock_wait()
>> -> xen_clear_irq_pending()
>> clears kick for lock1
>> -> xen_poll_irq()
>> spin_unlock_irq_restore(lock2)
>> -> xen_qlock_kick(CPU 2)
>> wakes up
>> spin_unlock_irq_restore(lock2)
>> IRET
>> resumes in xen_qlock_wait()
>> -> xen_poll_irq()
>> never wakes up
>>
>> The solution is to disable interrupts in xen_qlock_wait() and not to
>> poll for the irq in case xen_qlock_wait() is called in nmi context.
>
> Are precautions against NMI really worthwhile? Locks acquired both
> in NMI context as well as outside of it are liable to deadlock anyway,
> aren't they?

The locks don't need to be the same. A NMI-only lock tried to be
acquired with xen_qlock_wait() for another lock having been interrupted
by the NMI will be enough to risk the issue.

So yes, I believe the test for NMI is good to have.


Juergen