Re: Device hang when offlining a CPU due to IRQ misrouting

From: Darrick J. Wong
Date: Mon Jun 18 2007 - 18:37:29 EST


On Thu, Jun 07, 2007 at 05:57:26PM -0700, Siddha, Suresh B wrote:

> As you have the failing system, you need to do more detective work and
> help me out. Can you try this debug patch and send across the dmesg after the
> bug happens and also can you try different compiler to see if something
> changes..

Hrm, I just updated to -rc5. Interrupts being handled by the IOAPIC
don't suffer from this problem, but MSI interrupts are still affected.
I added a few printks to the kernel to figure out what IRQ affinity
masks were being passed around and saw this:

[ 256.298773] Breaking affinity for irq 4341
[ 256.298774] irq=4341 affinity=2 mask=d
<call to set_affinity>
[ 256.298787] irq=4341 affinity=d
<ethernet on irq 4341 stops working>

I'll keep digging, but at least it appears that the problem has been
shrunk down to something the MSI code.

--D

Attachment: signature.asc
Description: Digital signature