Re: 4.4.1 regression from 4.1.x: Soekris net5501 crash in IRQ after mfgpt timer initialization
From: Thomas Gleixner
Date: Tue Feb 02 2016 - 09:48:38 EST
On Tue, 2 Feb 2016, Nix wrote:
> [Cc:ed Thomas on the vague hope that maybe this is something to do with
> the IRQ subsystem in general, though I doubt it, since only the one
> machine is crashing for me: it's probably the CS5535's interactions
> with said subsystem at fault.]
Kinda. That driver does the following, in this order:

    set up the irq in the CS5535
    request the interrupt to install the handler
    register the clockevents device

> [ 1.589543] cs5535-clockevt: Registering MFGPT timer as a clock event, using IRQ 7
> [ 1.604921] BUG: unable to handle kernel NULL pointer dereference at (null)
> [ 1.605101] [<c02cf141>] ? mfgpt_tick+0x6e/0x77
...
> [ 1.605339] [<c039c1a9>] common_interrupt+0x29/0x30
...
> [ 1.605394] [<c013e4e7>] vprintk_emit+0x2b4/0x2be
> [ 1.605426] [<c013e5f2>] vprintk_default+0x12/0x14
> [ 1.605446] [<c0164cd6>] printk+0x11/0x13
> [ 1.605462] [<c04b625a>] cs5535_mfgpt_init+0xce/0xf1
So the interrupt hits before the clockevent device is registered, i.e. before
its event handler is installed, and mfgpt_tick() will happily call through a
NULL pointer.
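
To make the ordering problem concrete, here is a minimal user-space sketch.
It is not the driver code: every sim_* name is a hypothetical stand-in, and
it only assumes that the interrupt handler can look at the device state and
bail out while the device is still detached or shut down.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; this is not the kernel's clockevents API. */
enum sim_state { SIM_DETACHED, SIM_SHUTDOWN, SIM_ONESHOT };

struct sim_clockevent {
	enum sim_state state;
	void (*event_handler)(struct sim_clockevent *evt);
};

static int sim_ticks_delivered;

static void sim_handle_tick(struct sim_clockevent *evt)
{
	(void)evt;
	sim_ticks_delivered++;
}

/* Models registration: only from here on is event_handler valid. */
static void sim_register(struct sim_clockevent *evt)
{
	evt->event_handler = sim_handle_tick;
	evt->state = SIM_ONESHOT;
}

/*
 * Models the interrupt handler. Returns 1 if the tick was forwarded,
 * 0 if it was swallowed because the device is not ready yet.
 */
static int sim_irq(struct sim_clockevent *evt)
{
	/* Without this check an early interrupt calls a NULL pointer. */
	if (evt->state == SIM_DETACHED || evt->state == SIM_SHUTDOWN)
		return 0;

	assert(evt->event_handler != NULL);
	evt->event_handler(evt);
	return 1;
}
```

Calling sim_irq() on a device that starts out as { SIM_DETACHED, NULL }
swallows the tick; after sim_register() the same call forwards it to the
handler.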
The patch below should fix^Wwork around the issue.
Thanks,
tglx
8<----------------
--- a/drivers/clocksource/cs5535-clockevt.c
+++ b/drivers/clocksource/cs5535-clockevt.c
@@ -117,7 +117,8 @@ static irqreturn_t mfgpt_tick(int irq, void *dev_id)
 	/* Turn off the clock (and clear the event) */
 	disable_timer(cs5535_event_clock);
 
-	if (clockevent_state_shutdown(&cs5535_clockevent))
+	if (clockevent_state_shutdown(&cs5535_clockevent) ||
+	    clockevent_state_detached(&cs5535_clockevent))
 		return IRQ_HANDLED;
 
 	/* Clear the counter */