Re: Network problem with 2.6.18-mm1 ?

From: Eric W. Biederman
Date: Fri Sep 29 2006 - 19:32:11 EST


"Jesse Brandeburg" <jesse.brandeburg@xxxxxxxxx> writes:

> On 9/28/06, Sukadev Bhattiprolu <sukadev@xxxxxxxxxx> wrote:
>> $ cat /proc/interrupts
>>
>> CPU0 CPU1
>> 28: 0 0 IO-APIC-fasteoi eth0
>> NMI: 96 35
>> LOC: 18251 18524
>> ERR: 0
>
> you should be getting an interrupt every two seconds from the eth0
> (e1000) driver. You are having interrupt delivery problems probably
> due to something screwing up interrupt routing in the kernel.
> Normally these issues are associated with MSI interrupts but your
> adapter doesn't support those and is using generic IRQ
>
> I'm guessing that if you somehow enable interrupts on your vga card on
> the same bus as e1000 (bus 3) it will have interrupt delivery problems
> as well. Maybe try xorg?

To summarize.

We have an e1000 plugged into a pci-x slot on an Opteron system with
an amd chipset.

That motherboard has 3 ioapics. (One on each PCI-X bridge and
one on the 8111 for handling everything else.

We know we are getting interrupts through the 8111 ioapic.

We don't know which ioapic the pci-x bus is hooked to.

So either the ioapics on the 8131 are having problems.
Or we have a problem parsing the irq routing tables.

We see in dmesg.
[ 0.000000] I/O APIC #2 at 0xFEC00000.
[ 0.000000] I/O APIC #3 at 0xE8000000.
[ 0.000000] I/O APIC #4 at 0xE8001000.
...
[ 97.410411] PCI: Cannot allocate resource region 0 of device 0000:00:0a.1
[ 97.423945] PCI: Cannot allocate resource region 0 of device 0000:00:0b.1

We see in lspci
>
> 0000:00:0a.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01)
> (prog-if 10 [IO-APIC])
> Subsystem: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC
> Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> Stepping- SERR- FastB2B-
> Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
> <TAbort- <MAbort- >SERR- <PERR-
> Latency: 0
> Region 0: Memory at 88100000 (64-bit, non-prefetchable) [size=4K]
>
> 0000:00:0b.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01)
> (prog-if 10 [IO-APIC])
> Subsystem: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC
> Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> Stepping- SERR- FastB2B-
> Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
> <TAbort- <MAbort- >SERR- <PERR-
> Latency: 0
> Region 0: Memory at 88101000 (64-bit, non-prefetchable) [size=4K]

So it looks like the kernel moved the ioapics.

The following patch in 2.6.18-mm1 is known to have that effect.
x86_64-mm-insert-ioapics-and-local-apic-into-resource-map

Can you please try reverting that one patch?

There is a fix an updated version of that patch I think in -mm2
but I haven't had a chance to see if it fixes the problem yet.

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/