Re: Severe problem with e1000 driver in 2.4.31/32 (at least)

From: Jesse Brandeburg
Date: Thu Feb 16 2006 - 16:47:09 EST


---------- Forwarded message ----------
From: Stephan von Krawczynski <skraw@xxxxxxxxxx>
today we had to experience a bug in e1000 network driver on this type of
network card (a current PCI e1000 sold everywhere):

02:07.0 Ethernet controller: Intel Corp.: Unknown device 107c (rev 05)
Subsystem: Intel Corp.: Unknown device 1376
Flags: bus master, 66Mhz, medium devsel, latency 32, IRQ 18
Memory at f9000000 (32-bit, non-prefetchable) [size=128K]
Memory at f9020000 (32-bit, non-prefetchable) [size=128K]
I/O ports at a800 [size=64]
Expansion ROM at <unassigned> [disabled] [size=128K]
Capabilities: [dc] Power Management version 2
Capabilities: [e4] PCI-X non-bridge device.

We can simply crash the box (2.4.32 stock kernel, 2.4.31 is the same) by
performing this:

1) Boot box with network physically disconnected
2) start pinging some host somewhere, you get "unreachable", let it run
3) connect the network and await some ping replies.
4) disconnect the network again
5) box is dead

The box runs into a BUG in e1000_hw.c line 5052. The BUG shows up because the
code is obviously executed inside an interrupt, which seems not intended.
As this BUG is always reproducable and pretty annoying we made this pretty bad
workaround:

Please try this patch, compile tested. It matches up this particular code to what is currently in 2.6.16-rc

e1000: fix BUG reported due to calling msec_delay in irq context

There are some functions that are called in irq context that need to use
msec_delay_irq instead to avoid a BUG.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@xxxxxxxxx>

---

drivers/net/e1000/e1000_hw.c | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/e1000/e1000_hw.c b/drivers/net/e1000/e1000_hw.c
--- a/drivers/net/e1000/e1000_hw.c
+++ b/drivers/net/e1000/e1000_hw.c
@@ -5049,7 +5049,7 @@ e1000_config_dsp_after_link_change(struc
if(ret_val)
return ret_val;

- msec_delay(20);
+ msec_delay_irq(20);

ret_val = e1000_write_phy_reg(hw, 0x0000,
IGP01E1000_IEEE_FORCE_GIGA);
@@ -5073,7 +5073,7 @@ e1000_config_dsp_after_link_change(struc
if(ret_val)
return ret_val;

- msec_delay(20);
+ msec_delay_irq(20);

/* Now enable the transmitter */
ret_val = e1000_write_phy_reg(hw, 0x2F5B, phy_saved_data);
@@ -5098,7 +5098,7 @@ e1000_config_dsp_after_link_change(struc
if(ret_val)
return ret_val;

- msec_delay(20);
+ msec_delay_irq(20);

ret_val = e1000_write_phy_reg(hw, 0x0000,
IGP01E1000_IEEE_FORCE_GIGA);
@@ -5114,7 +5114,7 @@ e1000_config_dsp_after_link_change(struc
if(ret_val)
return ret_val;

- msec_delay(20);
+ msec_delay_irq(20);

/* Now enable the transmitter */
ret_val = e1000_write_phy_reg(hw, 0x2F5B, phy_saved_data);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/