Spooky "crashes in parallel"

Chris Evans (chris@ferret.lmh.ox.ac.uk)
Tue, 24 Nov 1998 13:09:44 +0000 (GMT)


Hi,

Here's something I meant to post a while back.

Basically, I suspect a nasty bug in the ne2k driver, in at least 2.0.x. We
used to have the weird situation where two Linux machines with ne2k cards
would hang hard at the same time.

To my knowledge, the cards were slightly different ne2k variants, and
Win95 machines with these cards tended to survive.

This behaviour seemed to start at the same time as storms of bad network
packets, perhaps from some router somewhere gone crazy. By "bad" packets I
mean raw "010101010101" binary on the wire for quite a number of packets.
Hence, perhaps the ne2k driver doesn't drop erroneous packets as safely as
it could.

Of the two machines, one is vital. Since switching that one to a DEC based
card (and the de4x5 driver), its problems are fixed. But the other machine,
still ne2k based, continues to hang hard on a regular basis.

We also tried the tulip.c driver with the DEC card. It was not as
catastrophic as the ne2k, i.e. no hangs, but transmit would still cease to
work after about a day, a problem still commonly reported for various
netcard drivers.

The de4x5 based machine has 25 days of flawless uptime. In that time, the
dodgy network has supplied ~10,000 receive errors. There are also ~4,000
transmit errors.

I have just noticed that collisions are _very_ high (10%, ouch). Perhaps
the handling of excessive collisions is another dodgy area.
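
For anyone wanting to check the same counters on their own box, here is a
rough sketch of how the error and collision figures could be pulled out of
/proc/net/dev. Several things here are assumptions rather than anything
stated above: the field order is the layout used by current kernels (2.0.x
prints fewer columns, so its lines would simply be skipped), the collision
rate is taken as collisions divided by transmitted packets, and the sample
numbers are made up purely to illustrate the arithmetic. The function
names are hypothetical.

```python
# Assumed column order, taken from the /proc/net/dev layout of current
# kernels. Older kernels such as 2.0.x print fewer columns.
RX_FIELDS = ("bytes", "packets", "errs", "drop",
             "fifo", "frame", "compressed", "multicast")
TX_FIELDS = ("bytes", "packets", "errs", "drop",
             "fifo", "colls", "carrier", "compressed")


def parse_net_dev(text):
    """Return {interface: {"rx_<field>": int, "tx_<field>": int}}."""
    stats = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, _, rest = line.partition(":")
        values = rest.split()
        if len(values) != len(RX_FIELDS) + len(TX_FIELDS):
            continue  # unexpected layout: skip rather than guess
        if not all(v.isdigit() for v in values):
            continue
        numbers = [int(v) for v in values]
        counters = {}
        for field, value in zip(RX_FIELDS, numbers[:len(RX_FIELDS)]):
            counters["rx_" + field] = value
        for field, value in zip(TX_FIELDS, numbers[len(RX_FIELDS):]):
            counters["tx_" + field] = value
        stats[name.strip()] = counters
    return stats


def collision_percent(counters):
    """Collisions as a percentage of transmitted packets (assumed metric)."""
    tx_packets = counters["tx_packets"]
    if tx_packets == 0:
        return 0.0
    return 100.0 * counters["tx_colls"] / tx_packets


# Made-up sample in the assumed layout; the values are illustrative only.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
  eth0: 9000000   60000 10000    0    0     0          0         0  5000000   40000 4000    0    0  4000       0          0
"""

if __name__ == "__main__":
    for iface, c in parse_net_dev(SAMPLE).items():
        print(iface, "rx errors:", c["rx_errs"],
              "tx errors:", c["tx_errs"],
              "collisions: %.1f%%" % collision_percent(c))
```

On a live machine the same functions could be fed the contents of
/proc/net/dev instead of SAMPLE. Sampling twice and taking the difference
would show the current rate rather than the total since boot.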

Cheers
Chris
