Debugging system hangs

Randy Gobbel (gobbel@cogsci.ucsd.edu)
Fri, 13 Sep 1996 15:32:20 -0700


Is there any way to get any information out of a hung system? A core dump,
register state--anything! I've been experiencing hangs under various
conditions from the time I first brought up my system, just a few weeks ago.
I started with Debian 1.1 (kernel 2.0.0), and have followed the kernel patches
since then, but nothing has helped.

I have a medium-size neural network simulation that reliably hangs after
running for about twenty minutes. It beats on the X server pretty hard, but I
have another program (Crack 4.1) that also hangs up after about the same
amount of time, without using X at all. The only things I can think of that
these applications have in common are that they both compute for long
periods, do many tiny file writes while running, and open and close files
thousands of times in a run. Without some way of preserving the carcass of
the crashed system, that's about as detailed as I can get, unfortunately.
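
For anyone who wants to try reproducing this, the access pattern described
above (long computation interleaved with tiny writes and constant file
open/close) can be approximated by a small stand-alone program. The Python
sketch below is hypothetical and is not taken from either application: the
function names, file names, and counts are invented for illustration, and
nothing here establishes that this pattern is what actually triggers the hang.

```python
import os
import tempfile


def crunch(n):
    """Burn some CPU between writes (a stand-in for the real computation)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def stress(directory, cycles=1000, files=16, payload=b"x" * 32, work=10000):
    """Repeat `cycles` times: compute, open a file, append a few bytes, close it.

    Returns the total number of bytes written. All defaults are arbitrary.
    """
    written = 0
    for i in range(cycles):
        crunch(work)
        # Rotate through a small set of files so each is reopened many times.
        path = os.path.join(directory, "out%03d.dat" % (i % files))
        with open(path, "ab") as f:  # one open/close per tiny write
            f.write(payload)
        written += len(payload)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(stress(d), "bytes written")
```

Raising `cycles` and `work` lengthens the run. If a program like this hangs
the machine too, that would narrow the search without involving X or the
original applications.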

If you know of any clever ways to debug system hangs, please let me know. The
machine is a PPro 200 with a SCSI disk and CD-ROM, an Adaptec 2940-UW SCSI
adapter, a #9 (Number Nine) Imagine 128-2 video card, and a 3Com 3C590
Ethernet card. I'm currently running kernel 2.0.19, which is no more or less
broken than any previous version since 2.0.0.

-Randy

-- 
http://cogsci.ucsd.edu/~gobbel/
