Re: 2.2.14 SMP 3com905: transmit timed out: Odd lost irq and ip-stack lockup

From: Andrew Morton (andrewm@uow.edu.au)
Date: Thu Oct 12 2000 - 09:33:36 EST


"Dr. Michael Weller" wrote:
>
> Dear list,
>
> I run a Compaq Proliant 1500 (dual Pentium 75.200) with hardware raid
> (Smart2) with two ethernet cards 3com905 (b or c, I can't tell you right
> now) as a firewall and web/mail virus scanner which (needless to say)
> needs to be up 7d/24h.
>
> Recently, during a pretty fast download the machine (ethernet technically,
> you could login on the console, even ping the ethernet ip address) locked
> up with the following error log:
>
> Oct 9 17:29:02 fwintern kernel: eth0: transmit timed out, tx_status
> 00 status e681.

Here is your problem:

> Oct 9 17:29:02 fwintern kernel: eth0: Interrupt posted but not
> delivered -- IRQ blocked by another device?

This is the infamous APIC bug. I have about ten reports of this over a
four-month period. Mark Hemment mentioned it just yesterday.

This is not a 3c59x problem. It is due to the APIC forgetting how to
generate interrupts for a particular IRQ. It happens mostly for NICs
because they generate a lot of interrupts. I've had it happen just
once. In that case, _nothing_ would make the interrupt come back
(including a driver unload/reload).
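
One rough way to confirm from user space that an IRQ has gone quiet is to
compare two snapshots of /proc/interrupts taken while the NIC is under
load. The sketch below is only an illustration and is not part of the
3c59x driver. The function names parse_interrupts and stalled_irqs are
made up for this example, and it assumes the usual /proc/interrupts
layout: a CPU header line, then one "label: count ..." row per IRQ.

```python
def parse_interrupts(text):
    """Map each IRQ label to its interrupt count summed across all CPUs."""
    counts = {}
    for line in text.splitlines()[1:]:      # first line is the CPU header
        label, sep, rest = line.partition(":")
        if not sep:
            continue                        # not a "label: ..." row
        total = 0
        for field in rest.split():
            if not field.isdigit():
                break                       # reached controller/device names
            total += int(field)
        counts[label.strip()] = total
    return counts


def stalled_irqs(before, after):
    """Return IRQ labels seen in both snapshots whose count did not advance."""
    old = parse_interrupts(before)
    new = parse_interrupts(after)
    return sorted(irq for irq in old if irq in new and new[irq] <= old[irq])
```

Read /proc/interrupts twice, a few seconds apart, and pass both strings to
stalled_irqs(). An idle device also shows up as "stalled", so this is only
meaningful for the NIC's IRQ while it is supposed to be moving packets.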

This gets reported a lot by 3c59x users because this driver specifically
detects and reports on the problem.

Donald Becker says that this is a software bug (I don't know why he
thinks this). He says that he _always_ boots Linux with the `noapic'
option to prevent it from happening.

> The problem was reproducible (several times) with the same download (a
> 300MB file) after a reboot.

Interesting. So you had a stable 2.2.14 machine which suddenly started
repeatedly exhibiting this problem? Is it still reproducible?


