Re: irq0 stops working

From: Thomas Gleixner
Date: Tue Oct 09 2007 - 02:01:18 EST


On Tue, 9 Oct 2007, Vasily Averin wrote:
> On one of our servers timer interrupts (i.e irq0) are stops working. As result
> any kernel timers do not triggers and tasks waiting some signals from timers
> hangs forever.

Which kernel version ?

> Most noticeable effect of this situation is that any write operations to disk
> are stalled, and nobody can log in on the node.
>
> At the same time node all existing shells works away. I'm able to read
> interrupts statistic from /proc/interrupts file and it shows that all other
> interrupts are changed when these devices are accessed: disk on sata controller,
> network, cdrom on ide controller, keyboard, serial console, LOC interrupts.
>
> Also I've found that disable of irqbalance service on the node helps to
> workaround this issue, however of course it fixes nothing.

Well, it's at least a hint. Can you try the patch below please ?

tglx

diff --git a/arch/x86_64/kernel/time.c b/arch/x86_64/kernel/time.c
index 6d48a4e..248987a 100644
--- a/arch/x86_64/kernel/time.c
+++ b/arch/x86_64/kernel/time.c
@@ -360,7 +360,7 @@ void stop_timer_interrupt(void)

static struct irqaction irq0 = {
.handler = timer_interrupt,
- .flags = IRQF_DISABLED | IRQF_IRQPOLL,
+ .flags = IRQF_DISABLED | IRQF_IRQPOLL | IRQF_NOBALANCING,
.mask = CPU_MASK_NONE,
.name = "timer"
};
@@ -403,6 +403,7 @@ void __init time_init(void)
cpu_khz / 1000, cpu_khz % 1000);
init_tsc_clocksource();

+ irq0.mask = cpumask_of_cpu(0);
setup_irq(0, &irq0);
}

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/