irq0 stops working
From: Vasily Averin
Date: Tue Oct 09 2007 - 01:27:34 EST
On one of our servers, timer interrupts (i.e. irq0) stop working. As a result,
kernel timers do not fire, and tasks waiting for signals from timers
hang forever.
The most noticeable effect of this situation is that all write operations to
disk stall, and nobody can log in to the node.
At the same time, all existing shells on the node keep working. I am able to
read interrupt statistics from the /proc/interrupts file, and it shows that the
counters for all other interrupts increase when the corresponding devices are
accessed: disk on the SATA controller, network, cdrom on the IDE controller,
keyboard, serial console, LOC interrupts.
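
For anyone who wants to check the same thing on their own node, here is a
minimal sketch of how two saved copies of /proc/interrupts could be compared.
The helper names irq_total and irq_delta are hypothetical, and the sketch
assumes the usual layout: a header line with one CPUn column per CPU, then one
"label: counts..." row per interrupt. It deliberately works on text that has
already been captured (for example with cat in an existing shell) and does not
sleep between samples, because a sleep may never return on a node where
timers no longer fire.

```python
def irq_total(text, irq="0"):
    """Sum the per-CPU counts for one row of /proc/interrupts-style text."""
    lines = text.splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if sep and label.strip() == irq:
            # Only the first ncpu fields are counters; the rest is the
            # controller and device name.
            fields = rest.split()[:ncpu]
            return sum(int(f) for f in fields if f.isdigit())
    raise KeyError(irq)


def irq_delta(before, after, irq="0"):
    """How much one interrupt counter grew between two snapshots."""
    return irq_total(after, irq) - irq_total(before, irq)
```

A delta of zero for "0" while other rows such as "LOC" keep growing would
match the behaviour described above.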
I have also found that disabling the irqbalance service on the node works
around this issue, although of course it fixes nothing.
All details about the hardware and the logs can be found at
http://bugzilla.kernel.org/show_bug.cgi?id=8650
I am able to reproduce this situation, but at the moment I have no idea how to
continue investigating the problem.
Could anybody please suggest new ways to investigate this issue?
Thank you,
Vasily Averin
OpenVZ Linux Kernel Team