Re: [BUG, bisect, linux-next] do_IRQ: No irq handler for vector

From: Jeremiah Mahler
Date: Sun Dec 20 2015 - 02:33:57 EST


Jiang Liu,

On Thu, Dec 17, 2015 at 07:40:33PM -0800, Jeremiah Mahler wrote:
> all,
>
> I just started getting these "No irq handler for vector" messages
> after upgrading to linux-next 20151217+.
>
>
> (from the first boot)
> ...
> [ 2.282652] [drm] Initialized drm 1.1.0 20060810
> [ 2.318806] AVX version of gcm_enc/dec engaged.
> [ 2.318810] AES CTR mode by8 optimization enabled
> [ 2.324446] do_IRQ: 0.35 No irq handler for vector
> [ 2.366146] iTCO_vendor_support: vendor-support=0
> [ 2.372762] iTCO_wdt: Intel TCO WatchDog Timer Driver v1.11
> ...
> [ 9.249887] wlan0: associate with 2c:5d:93:09:50:48 (try 1/3)
> [ 9.265206] wlan0: RX AssocResp from 2c:5d:93:09:50:48 (capab=0x421 status=0 aid=8)
> [ 9.284088] wlan0: associated
> [ 10.453048] do_IRQ: 0.35 No irq handler for vector
> [ 10.457923] do_IRQ: 0.35 No irq handler for vector
> [ 10.457932] do_IRQ: 0.35 No irq handler for vector
> [ 10.501026] do_IRQ: 0.35 No irq handler for vector
> [ 10.501033] do_IRQ: 0.35 No irq handler for vector
> [ 10.513951] do_IRQ: 0.35 No irq handler for vector
> ...
>
>
> (second boot, and after a resume)
> ...
> [10527.998694] PM: noirq resume of devices complete after 21.488 msecs
> [10527.999578] PM: early resume of devices complete after 0.850 msecs
> [10528.000525] rtc_cmos 00:02: System wakeup disabled by ACPI
> [10528.005265] do_IRQ: 0.84 No irq handler for vector
> [10528.005450] sd 0:0:0:0: [sda] Starting disk
> [10528.021257] tpm_tis 00:05: TPM is disabled/deactivated (0x6)
> ...
> [10530.005541] PM: resume of devices complete after 2005.925 msecs
> [10530.005690] usb 3-1.4:1.0: rebind failed: -517
> [10530.005696] usb 3-1.4:1.1: rebind failed: -517
> [10530.006575] Restarting tasks ...
> [10530.008347] do_IRQ: 0.84 No irq handler for vector
> [10530.021258] done.
> [10530.042883] Bluetooth: hci0: BCM: chip id 63
> ...
> [10559.005603] mei_me 0000:00:16.0: timer: init clients timeout hbm_state = 1.
> [10559.005612] mei_me 0000:00:16.0: unexpected reset: dev_state = INIT_CLIENTS fw status = 1E000245 60000106
> [10559.009508] do_IRQ: 0.84 No irq handler for vector
> [10561.005639] mei_me 0000:00:16.0: wait hw ready failed
> [10561.005644] mei_me 0000:00:16.0: hw_start failed ret = -62
> ...
>
>
> I can test patches if anyone has any ideas :-)
>
> --
> - Jeremiah Mahler

I performed a bisect and found that the following patch introduced the bug,
which is still present in the latest linux-next 20151218+.

From 41c7518a5d14543fa4aa1b5b9994ac26b38c0406 Mon Sep 17 00:00:00 2001
From: Jiang Liu <jiang.liu@xxxxxxxxxxxxxxx>
Date: Mon, 30 Nov 2015 16:09:29 +0800
Subject: [PATCH] x86/irq: Fix a race condition between vector assigning and
cleanup

Joe Lawrence reported a use-after-release issue in the x86 IRQ
management code. Please refer to the following link for more
information: http://lkml.kernel.org/r/5653B688.4050809@xxxxxxxxxxx

Thomas pointed out that it's caused by a race condition between
__assign_irq_vector() and __send_cleanup_vector(). Based on Thomas'
draft patch, we solve this race condition by:
1) Use move_in_progress to signal that an IRQ cleanup IPI is needed
2) Use old_domain to save old CPU mask for IRQ cleanup
3) Use vector_lock to protect move_in_progress and old_domain

This bugfix patch also helps to get rid of that atomic allocation in
__send_cleanup_vector().

Fixes: a782a7e46bb5 "x86/irq: Store irq descriptor in vector array"
Reported-and-tested-by: Joe Lawrence <joe.lawrence@xxxxxxxxxxx>
Signed-off-by: Jiang Liu <jiang.liu@xxxxxxxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
Link: http://lkml.kernel.org/r/1448870970-1461-4-git-send-email-jiang.liu@xxxxxxxxxxxxxxx
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
---
arch/x86/kernel/apic/vector.c | 77 +++++++++++++++++++------------------------
1 file changed, 34 insertions(+), 43 deletions(-)
...
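
In case it helps anyone reading along, here is a rough sketch (my own
paraphrase, not the actual diff) of the locking scheme that commit
message describes: move_in_progress flags a pending cleanup IPI,
old_domain remembers which CPUs still hold the stale vector, and both
are only touched under vector_lock, so __send_cleanup_vector() no
longer needs an atomic allocation.

/*
 * Reduced sketch of the scheme described in the changelog above; not
 * the actual patch.  Names follow arch/x86/kernel/apic/vector.c, but
 * the bodies are simplified down to the locking pattern.
 */
static DEFINE_RAW_SPINLOCK(vector_lock);

struct apic_chip_data {
	cpumask_var_t	domain;		/* CPUs the irq currently targets */
	cpumask_var_t	old_domain;	/* CPUs still awaiting cleanup    */
	bool		move_in_progress;	/* cleanup IPI not yet sent */
	/* ... */
};

/*
 * Called with vector_lock held, so move_in_progress and old_domain
 * cannot change underneath us while a new vector is being picked.
 */
static int __assign_irq_vector(int irq, struct apic_chip_data *d,
			       const struct cpumask *mask)
{
	if (d->move_in_progress)
		return -EBUSY;		/* previous move not cleaned up yet */

	cpumask_copy(d->old_domain, d->domain);	/* remember the old CPUs */
	/* ... pick and install the new vector/domain here ... */
	d->move_in_progress = !cpumask_empty(d->old_domain);
	return 0;
}

static void __send_cleanup_vector(struct apic_chip_data *d)
{
	raw_spin_lock(&vector_lock);
	d->move_in_progress = false;
	/*
	 * old_domain already holds the CPUs that must drop the stale
	 * vector, so no per-call allocation is needed any more.
	 */
	if (!cpumask_empty(d->old_domain))
		apic->send_IPI_mask(d->old_domain, IRQ_MOVE_CLEANUP_VECTOR);
	raw_spin_unlock(&vector_lock);
}

As I read it, the point is that the cleanup IPI target set now lives in
old_domain under vector_lock instead of in a separately allocated
structure. Whether my regression comes from this path or somewhere else,
I cannot tell; the bisect just points here.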

--
- Jeremiah Mahler