Re: [PATCH] xen/events: Don't move disabled irqs
From: Boris Ostrovsky
Date: Tue May 10 2016 - 12:01:30 EST
On 05/10/2016 11:11 AM, Ross Lagerwall wrote:
> Commit ff1e22e7a638 ("xen/events: Mask a moving irq") open-coded
> irq_move_irq() but left out checking if the IRQ is disabled. This broke
> resuming from suspend since it tries to move a (disabled) irq without
> holding the IRQ's desc->lock. Fix it by adding in a check for disabled
> IRQs.
>
> The resulting stacktrace was:
> kernel BUG at /build/linux-UbQGH5/linux-4.4.0/kernel/irq/migration.c:31!
> invalid opcode: 0000 [#1] SMP
> Modules linked in: xenfs xen_privcmd ...
> CPU: 0 PID: 9 Comm: migration/0 Not tainted 4.4.0-22-generic #39-Ubuntu
> Hardware name: Xen HVM domU, BIOS 4.6.1-xs125180 05/04/2016
> task: ffff88003d75ee00 ti: ffff88003d7bc000 task.ti: ffff88003d7bc000
> RIP: 0010:[<ffffffff810e26e2>] [<ffffffff810e26e2>] irq_move_masked_irq+0xd2/0xe0
> RSP: 0018:ffff88003d7bfc50 EFLAGS: 00010046
> RAX: 0000000000000000 RBX: ffff88003d40ba00 RCX: 0000000000000001
> RDX: 0000000000000001 RSI: 0000000000000100 RDI: ffff88003d40bad8
> RBP: ffff88003d7bfc68 R08: 0000000000000000 R09: ffff88003d000000
> R10: 0000000000000000 R11: 000000000000023c R12: ffff88003d40bad0
> R13: ffffffff81f3a4a0 R14: 0000000000000010 R15: 00000000ffffffff
> FS: 0000000000000000(0000) GS:ffff88003da00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fd4264de624 CR3: 0000000037922000 CR4: 00000000003406f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Stack:
> ffff88003d40ba38 0000000000000024 0000000000000000 ffff88003d7bfca0
> ffffffff814c8d92 00000010813ef89d 00000000805ea732 0000000000000009
> 0000000000000024 ffff88003cc39b80 ffff88003d7bfce0 ffffffff814c8f66
> Call Trace:
> [<ffffffff814c8d92>] eoi_pirq+0xb2/0xf0
> [<ffffffff814c8f66>] __startup_pirq+0xe6/0x150
> [<ffffffff814ca659>] xen_irq_resume+0x319/0x360
> [<ffffffff814c7e75>] xen_suspend+0xb5/0x180
> [<ffffffff81120155>] multi_cpu_stop+0xb5/0xe0
> [<ffffffff811200a0>] ? cpu_stop_queue_work+0x80/0x80
> [<ffffffff811203d0>] cpu_stopper_thread+0xb0/0x140
> [<ffffffff810a94e6>] ? finish_task_switch+0x76/0x220
> [<ffffffff810ca731>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
> [<ffffffff810a3935>] smpboot_thread_fn+0x105/0x160
> [<ffffffff810a3830>] ? sort_range+0x30/0x30
> [<ffffffff810a0588>] kthread+0xd8/0xf0
> [<ffffffff810a04b0>] ? kthread_create_on_node+0x1e0/0x1e0
> [<ffffffff8182568f>] ret_from_fork+0x3f/0x70
> [<ffffffff810a04b0>] ? kthread_create_on_node+0x1e0/0x1e0
>
> Signed-off-by: Ross Lagerwall <ross.lagerwall@xxxxxxxxxx>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
although I am not sure I understand why we do eoi_pirq() from
startup_pirq()
-boris
> ---
> drivers/xen/events/events_base.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
> index cb7138c..71d49a9 100644
> --- a/drivers/xen/events/events_base.c
> +++ b/drivers/xen/events/events_base.c
> @@ -487,7 +487,8 @@ static void eoi_pirq(struct irq_data *data)
> if (!VALID_EVTCHN(evtchn))
> return;
>
> - if (unlikely(irqd_is_setaffinity_pending(data))) {
> + if (unlikely(irqd_is_setaffinity_pending(data)) &&
> + likely(!irqd_irq_disabled(data))) {
> int masked = test_and_set_mask(evtchn);
>
> clear_evtchn(evtchn);
> @@ -1370,7 +1371,8 @@ static void ack_dynirq(struct irq_data *data)
> if (!VALID_EVTCHN(evtchn))
> return;
>
> - if (unlikely(irqd_is_setaffinity_pending(data))) {
> + if (unlikely(irqd_is_setaffinity_pending(data)) &&
> + likely(!irqd_irq_disabled(data))) {
> int masked = test_and_set_mask(evtchn);
>
> clear_evtchn(evtchn);