Re: The buggy APIC of the Abit BP6

From: Helge Hafting (helgehaf@aitel.hist.no)
Date: Thu Jun 13 2002 - 04:05:02 EST


Robbert Kouprie wrote:
>
> Hi all,
>
> First of all, I know the Abit BP6 is infamous for its APIC, but I
> would like to make sure there's absolutely no solution for this except
> disabling the APIC.
>
> I have been experiencing problems for a long time now, and they are
> always related to the NIC in the box (probably because it is a device
> that generates a lot of interrupts). The NIC has changed a couple of
> times (from 3com 10 Mbit to Intel eepro100 to 3Com PCI 3c905B Cyclone
> 100baseTx now), and it's NOT placed in the infamous (I believe 3rd) PCI
> slot of the board (mentioned in the manual). Also, /proc/interrupts
> shows NO sharing with another device. The running kernel is
> 2.4.19-pre8-ac5 SMP, though many kernels have preceded it, with the
> same results.
>
> The problems appear once in a while (on the order of days/weeks). They
> always start with an "unexpected IRQ trap at vector 7d", followed
> within a minute by chaos in the network driver. I found the
> message of the 3com driver to be the clearest one, see the snippet
> below. When I boot with "noapic", the problems go away.
>
> Is there a solution that does not require disabling the APIC as a whole,
> or is this hardware just too flaky?
>
> Thanks in advance,
> - Robbert Kouprie
>
> PS. Please CC me in answers, as I'm not on the list.
>
> Jun 12 23:47:56 radium kernel: unexpected IRQ trap at vector 7d
> Jun 12 23:47:56 radium kernel: unexpected IRQ trap at vector 7d

It _can_ be solved - rebooting cures it, so assuming the problem
is autodetectable, it _can_ be solved by doing whatever it is that
a reboot (or driver reload) does to the APIC.

My guess is that the APIC setup for that IRQ has to be reprogrammed.
You could do that as a quirk for the BP6.
The first question is whether there is a reliable way to detect this
condition. "No interrupts from a device" could simply mean that
it isn't used much at the time. You get an unexpected IRQ trap - does
the problem always manifest itself this way?
The second question is whether all the PCI card drivers out there
survive a lost interrupt that is handled outside the driver.
If not, you have to close+reopen the device, and that involves
userspace.
A network card will need reinitialization, a disk controller
remounting...
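
To illustrate the detection problem, here is a rough userspace sketch
(not kernel code, and not a tested fix for the BP6). It compares two
snapshots of /proc/interrupts and flags an IRQ as possibly stalled only
if its count did not advance while the device is known to have had work
pending. The function names and the work_pending flag are made up for
this example; how you would learn that work is pending is
driver-specific and is exactly the open question above.

```python
def irq_counts(text):
    """Map IRQ label -> count summed over all CPUs, from /proc/interrupts text."""
    lines = text.splitlines()
    ncpus = len(lines[0].split())          # header row: CPU0 CPU1 ...
    counts = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        per_cpu = rest.split()[:ncpus]
        # Skip rows that do not carry one numeric column per CPU (e.g. ERR).
        if len(per_cpu) == ncpus and all(f.isdigit() for f in per_cpu):
            counts[label.strip()] = sum(int(f) for f in per_cpu)
    return counts


def looks_stalled(before, after, irq, work_pending):
    """Heuristic only: no new interrupts although the device had work to do.

    work_pending is a hypothetical input; an idle device also shows an
    unchanged count, so the count alone proves nothing.
    """
    return work_pending and after.get(irq, 0) == before.get(irq, 0)
```

You would read /proc/interrupts twice, some seconds apart, pass both
texts through irq_counts() and then ask looks_stalled() about the IRQ
the NIC sits on. An idle card returns False as long as work_pending is
False, which is the whole point of not trusting the counter by itself.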

Helge Hafting