Re: [PATCH] Fix crash with irqpoll due to the IRQF_IRQPOLL flag

From: Linus Torvalds
Date: Wed May 23 2007 - 12:01:25 EST




On Tue, 22 May 2007, Bernhard Walle wrote:
>
> o System crashes if booted with irqpoll command line option.
>
> o Problem happens because Inside note_interrupt() we are accessing
> desc->action->flag without taking the desc->lock. While accessing it
> somebody goes ahead and unregisters the irq handler hence desc->action
> is NULL. By the time note_interrupt() checks it, it crashes.

I absolutely _detest_ patches that make already complex and unreadable
code even more so. Especially conditionals. Please don't do that.

If you need a variable for a conditional, make it be an implicit one from
an inline function, and aim for making it readable.

So how about instead writing it out as a nice self-explanatory inline
function? I can almost guarantee that this generates no worse code, and it
also makes it easy to explain things like "we don't bother with the lock,
because we don't care enough".

Untested, but I think the point of the patch is obvious. Anybody want to
test it, send it back to me, and fix the bug while making the code more
readable?

I think we should instate a hard new rule:

- if a bugfix doesn't make the code more readable, it's not a bugfix at
all.

There was a reason for the bug in the first place, and that reason is
likely that the code was hard to think about. So making it even _harder_
to think about isn't really fixing the deeper problem!

Linus

---
kernel/irq/spurious.c | 46 +++++++++++++++++++++++++++++++++++++---------
1 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/kernel/irq/spurious.c b/kernel/irq/spurious.c
index b0d81aa..bd9e272 100644
--- a/kernel/irq/spurious.c
+++ b/kernel/irq/spurious.c
@@ -135,6 +135,39 @@ report_bad_irq(unsigned int irq, struct irq_desc *desc, irqreturn_t action_ret)
}
}

+static inline int try_misrouted_irq(unsigned int irq, struct irq_desc *desc, irqreturn_t action_ret)
+{
+ struct irqaction *action;
+
+ if (!irqfixup)
+ return 0;
+
+ /* We didn't actually handle the IRQ - see if it was misrouted? */
+ if (action_ret == IRQ_NONE)
+ return 1;
+
+ /*
+ * But for 'irqfixup == 2' we also do it for handled interrupts if
+ * they are marked as IRQF_IRQPOLL (or for irq zero, which is the
+ * traditional PC timer interrupt.. Legacy)
+ */
+ if (irqfixup < 2)
+ return 0;
+
+ if (!irq)
+ return 1;
+
+ /*
+ * Since we don't get the descriptor lock, "action" can
+ * change under us. We don't really care, but we don't
+ * want to follow a NULL pointer. So tell the compiler to
+ * just load it once by using a barrier.
+ */
+ action = desc->action;
+ barrier();
+ return action && (action->flags & IRQF_IRQPOLL);
+}
+
void note_interrupt(unsigned int irq, struct irq_desc *desc,
irqreturn_t action_ret)
{
@@ -144,15 +177,10 @@ void note_interrupt(unsigned int irq, struct irq_desc *desc,
report_bad_irq(irq, desc, action_ret);
}

- if (unlikely(irqfixup)) {
- /* Don't punish working computers */
- if ((irqfixup == 2 && ((irq == 0) ||
- (desc->action->flags & IRQF_IRQPOLL))) ||
- action_ret == IRQ_NONE) {
- int ok = misrouted_irq(irq);
- if (action_ret == IRQ_NONE)
- desc->irqs_unhandled -= ok;
- }
+ if (unlikely(try_misrouted_irq(irq, desc, action_ret))) {
+ int ok = misrouted_irq(irq);
+ if (action_ret == IRQ_NONE)
+ desc->irqs_unhandled -= ok;
}

desc->irq_count++;
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/