Hendrik Visage (Hendrik.Visage@VECTOR.CO.ZA)
Wed, 4 Oct 1995 08:51:51 +0200 (SAT)

Hi there list,

> Has anyone had experience with the "Linux Sudden Death Syndrome?"

I noticed it when I moved to ELF from a Slackware distribution & kernel 1.3.2[3-9]
with an ne2000 clone, but my problems were possibly related to a machine whose
directories I NFS-mounted and whose network cable was faulty. It caused Linux (and
the other machines, running Solaris 2.x) to sometimes "lock up" for up to 5 minutes,
after which they recovered. Linux, I noticed, had the same waiting problem, but
sometimes also hung outright. I then attached a keyboard & screen (I had been using
it over the net ;^) and noticed that whenever I started a BIG (>20MB) ftp/nfs copy,
the machine's networking would stop, although the VTs and other programs could still
work for a while, and then everything stopped dead in its tracks.

> Symptoms include:
> 1) Hanging without warning, apparently randomly. can be anywhere from a few
> hours to up to 10 days.
> 2) No panic information.. nada.. system just stops. no pings, etc.
> 3) No VC switching works, but can scroll up/down within the current VC with
> shift-PgUp/shift-PgDn, so apparently SOMETHING is still working.
> Does anyone know causes and perhaps cures for this mysterious syndrome?

Possible cures (which all happened at the same time):
1) Replaced a faulty UTP (10base-T) cable to the server
2) Installed RedHat-2 with full ELF support
3) Compiled 1.3.30

Currently the machine has been up for 2 days...
* The networking code might need some debugging with errors on the line?
  (No, I don't have a problem with the net code, but errors and
  retransmissions might be a problem, and that is especially difficult to
  simulate.)
* Some network adapters aren't quite good?
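
If you want to check whether line errors coincide with the stalls, one rough
way is to compare the interface error counters before and after a big copy.
What follows is only a sketch, not something I ran on the machine above: it
assumes the /proc/net/dev layout of current kernels (two header lines, then 16
counters per interface), which the 1.3.x kernels did not necessarily share.
The function names parse_net_dev and error_deltas are made up for illustration.

```python
def parse_net_dev(text):
    """Parse the contents of /proc/net/dev into per-interface error counters.

    Assumes the modern layout: two header lines, then one line per
    interface with 8 receive and 8 transmit counters.
    """
    stats = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue  # unexpected layout, skip rather than guess
        stats[name.strip()] = {
            "rx_errs": int(fields[2]),   # receive errors
            "tx_errs": int(fields[10]),  # transmit errors
            "colls": int(fields[13]),    # collisions
        }
    return stats


def error_deltas(before, after):
    """Return how much each counter grew between two snapshots."""
    deltas = {}
    for iface, new in after.items():
        old = before.get(iface, {})
        deltas[iface] = {key: new[key] - old.get(key, 0) for key in new}
    return deltas


def read_net_dev(path="/proc/net/dev"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as handle:
        return parse_net_dev(handle.read())
```

Take one snapshot with read_net_dev() before the transfer and one after it,
then pass both to error_deltas(). Counters that climb during the copy point
towards the cable or the adapter rather than the net code.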

