nasty crashes with 2.0, no panics - how to debug?

Grant Taylor (gtaylor@picante.com)
Wed, 12 Jun 1996 15:42:22 -0400


I thought Linux *was* a kitchen sink until I saw some of the ideas
here for 2.1 ;)

Anywho:

I took the plunge and booted my systems under 2.0, but it's turned out
poorly. My system is a redhat-3.0.3 based system with the usual
updates to handle 2.0*.

It's crashed several times since Sunday. The first time it continued
to limp along saying:

Jun 11 19:46:00 pace kernel: Problem: block on freelist at 0043b40c isn't free.

every minute. Poor bash couldn't fork to exec anything for lack of
memory, and it was just a big mess.

The next two times it died much worse, spewing data to the console
and becoming totally unusable. The data it spews looks like a dump of
some sort: [xx xx xx xx xx] [xx xx xx xx xx], ad nauseam.

My other machine died last night as well, but its console was
blanked by the screen saver and wouldn't come back, so I couldn't see
whatever it said.

Did anyone ever put in kernel crash dumps of some sort so I can figure
this out? Without even a regular panic message, I'm unable to find
why it's dead. It seems to crash more frequently when traffic to the
w95 box picks up, but that's mostly just a feeling, and the
non-routing machine also crashes.

The following things are happening on my systems and network. Do any
of these things tickle known bugs in 2.0?

- 1 unpatched Win95 box on the network routed and samba'd by Linux.
- 1 ne2000 clone, driver not a module, in each.
- Both the plain ISA system and the VL/ISA one crash.
- 16M or 8M with the 16M limit specified
- Kernel compiled for 486. Crashes on both Intel and AMD 486s.
- Both linux systems mount nfs from each other, nfs is a module.
- kerneld, etc from modules-1.3.69k

The following things are interesting but probably not it since they
happen on only one of the two crash-prone systems (albeit the more
crash-prone of the two):

- PPP connection (to a Cisco of unknown rev) via modem on an STB 4COM.
- 1 each of disk, CD, and tape on an aha1542. non-scsi box dies too.
- IP accounting and firewalling; ~5-15 rules per chain.

Needless to say, neither system crashes under 1.2.13 w/ELF and
whatever other patches redhat put in.

* "The usual updates to redhat-3.0.3 for 2.0" is defined as:

SysVinit-2.62-1.i386.rpm
ipfwadm-2.1-1.i386.rpm
modules-1.3.69k-1.i386.rpm
ppp-2.2.0f-1.i386.rpm
procps-0.99a-3.i386.rpm

My .config is http://www.picante.com/~gtaylor/config.txt

-- 
Grant Taylor - gtaylor@picante.com - http://www.picante.com/~gtaylor/
   Must we now speak only of dark meat and light meat on the 'net?
