mass "tulip_stop_rxtx() failed", network stops
From: Tomasz Chmielewski
Date: Tue Aug 23 2005 - 04:11:49 EST
We are running almost 20 Fujitsu-Siemens Scenic machines with a 2.6.8.1
kernel, each equipped with an onboard card that uses the tulip module:
02:01.0 Ethernet controller: Linksys NC100 Network Everywhere Fast Ethernet 10/100 (rev 11)
No problem with those.
We are running four more machines like that, the only difference is the
kernel they are running (2.6.11.4).
On some of them, there are serious problems with the network, and they
usually happen when the traffic is heavier than usual (e.g., a big
software deployment to several workstations, a remote backup, etc.).
The syslog is then full of entries like these:
Aug 21 04:04:30 SERVER-B-HS kernel: NETDEV WATCHDOG: eth0: transmit timed out
Aug 21 04:04:30 SERVER-B-HS kernel: 0000:00:06.0: tulip_stop_rxtx() failed
and they keep filling the logs for hours; the network doesn't work
anymore, and someone has to restart the network or the machine itself.
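
For spotting this state on a remote box, here is a minimal sketch (assuming
Python is available there; the function name count_tulip_failures is made up
for illustration) that counts the two kernel messages quoted above in a
batch of syslog lines. It only detects the symptom; it does not fix or
explain the driver problem.

```python
import re

# Patterns for the two kernel messages quoted above. The syslog prefix
# (timestamp, host name) is ignored, so only the message text has to match.
WATCHDOG_RE = re.compile(r"NETDEV WATCHDOG: (\S+): transmit timed out")
TULIP_RE = re.compile(r"tulip_stop_rxtx\(\) failed")


def count_tulip_failures(lines):
    """Return (timeouts per interface, number of tulip_stop_rxtx failures)."""
    timeouts = {}
    stop_failures = 0
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            iface = match.group(1)
            timeouts[iface] = timeouts.get(iface, 0) + 1
        if TULIP_RE.search(line):
            stop_failures += 1
    return timeouts, stop_failures
```

Fed from the syslog file (its path differs between distributions, so that
part is left out), a non-zero count could be used to trigger a network
restart from cron instead of waiting for someone to notice.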
It doesn't always happen under heavy traffic - sometimes you can fill the
100 Mbit link and do lots of reads from the disk, and nothing bad
happens for hours.
I saw some posts on this issue ("2.6.10-rc3: tulip-driver:
tulip_stop_rxtx() failed"), but they didn't seem similar to my
problem; I looked through the kernel changelogs after 2.6.10, but found
no description of this problem there, either.
Any help appreciated, because rebooting machines which are 500 km away
and are not responding is no fun :)
--
Tomek
http://wpkg.org