[tip:irq/core] genirq: Fix race on spurious interrupt detection

From: tip-bot for Lukas Wunner
Date: Fri Oct 19 2018 - 11:34:03 EST


Commit-ID: 746a923b863a1065ef77324e1e43f19b1a3eab5c
Gitweb: https://git.kernel.org/tip/746a923b863a1065ef77324e1e43f19b1a3eab5c
Author: Lukas Wunner <lukas@xxxxxxxxx>
AuthorDate: Thu, 18 Oct 2018 15:15:05 +0200
Committer: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
CommitDate: Fri, 19 Oct 2018 17:31:00 +0200

genirq: Fix race on spurious interrupt detection

Commit 1e77d0a1ed74 ("genirq: Sanitize spurious interrupt detection of
threaded irqs") made detection of spurious interrupts work for threaded
handlers by:

a) incrementing a counter every time the thread returns IRQ_HANDLED, and
b) checking whether that counter has increased every time the thread is
woken.

However for oneshot interrupts, the commit unmasks the interrupt before
incrementing the counter. If another interrupt occurs right after
unmasking but before the counter is incremented, that interrupt is
incorrectly considered spurious:

time
| irq_thread()
| irq_thread_fn()
| action->thread_fn()
| irq_finalize_oneshot()
| unmask_threaded_irq() /* interrupt is unmasked */
|
| /* interrupt fires, incorrectly deemed spurious */
|
| atomic_inc(&desc->threads_handled); /* counter is incremented */
v

This is observed with a hi3110 CAN controller receiving data at high volume
(from a separate machine sending with "cangen -g 0 -i -x"): The controller
signals a huge number of interrupts (hundreds of millions per day) and
every second there are about a dozen which are deemed spurious.

In theory with high CPU load and the presence of higher priority tasks, the
number of incorrectly detected spurious interrupts might increase beyond
the 99,900 threshold and cause disablement of the interrupt.

In practice it just increments the spurious interrupt count. But that can
cause people to waste time investigating it over and over.

Fix it by moving the accounting before the invocation of
irq_finalize_oneshot().

[ tglx: Folded change log update ]

Fixes: 1e77d0a1ed74 ("genirq: Sanitize spurious interrupt detection of threaded irqs")
Signed-off-by: Lukas Wunner <lukas@xxxxxxxxx>
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Mathias Duckeck <m.duckeck@xxxxxxxxx>
Cc: Akshay Bhat <akshay.bhat@xxxxxxxxxxx>
Cc: Casey Fitzpatrick <casey.fitzpatrick@xxxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx # v3.16+
Link: https://lkml.kernel.org/r/1dfd8bbd16163940648045495e3e9698e63b50ad.1539867047.git.lukas@xxxxxxxxx

---
kernel/irq/manage.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index fb86146037a7..9dbdccab3b6a 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -927,6 +927,9 @@ irq_forced_thread_fn(struct irq_desc *desc, struct irqaction *action)

local_bh_disable();
ret = action->thread_fn(action->irq, action->dev_id);
+ if (ret == IRQ_HANDLED)
+ atomic_inc(&desc->threads_handled);
+
irq_finalize_oneshot(desc, action);
local_bh_enable();
return ret;
@@ -943,6 +946,9 @@ static irqreturn_t irq_thread_fn(struct irq_desc *desc,
irqreturn_t ret;

ret = action->thread_fn(action->irq, action->dev_id);
+ if (ret == IRQ_HANDLED)
+ atomic_inc(&desc->threads_handled);
+
irq_finalize_oneshot(desc, action);
return ret;
}
@@ -1020,8 +1026,6 @@ static int irq_thread(void *data)
irq_thread_check_affinity(desc, action);

action_ret = handler_fn(desc, action);
- if (action_ret == IRQ_HANDLED)
- atomic_inc(&desc->threads_handled);
if (action_ret == IRQ_WAKE_THREAD)
irq_wake_secondary(desc, action);