However, I just got a lockup on 1.99.14 (leaving me no option but to bring
up my copy of 2.0) that at least had some unusual characteristics.
I was browsing some documents using Netscape (Netscape was running under
Solaris because I only have 16 MB on my Linux box). Because Solaris
can't access ncpfs directly, I had Apache serving up the subdirectories
of interest. Anyway, on one document retrieval Netscape took a long time
(many seconds, not the usual fraction of a second) to bring up the
document. And, sure enough, ls on the top-level ncpfs directory hung.
So, I shut down everything and get ready to reboot, but shutdown doesn't
seem to do anything. After several attempts, it still doesn't do anything.
So, I fire up top and start killing things. I kill a -bash with signal 15
and that doesn't do anything. So, I kill it -9 and top hangs. [Maybe it
was my bash? But another top shows that it's a zombie -- seems rather
strange since the -bash wouldn't have had a parent if it was my bash.]
[I'm reconstructing this from memory, and I didn't look very closely at
parent processes, sorry.]
Anyway, I finally get the idea to kill httpd. And, sure enough, the instant
I kill httpd I get a couple of lines that look like they came from printk
about some bad return values (255 and -3? These lines never showed up in
any log). System reboots a few seconds later, and 2.0 comes up...
Problem: ncpfs hangs intermittently.
Problem: something trashed a queue or process table or something.
Problem: I haven't a clue what I should be doing to better diagnose this.
[This kind of error only happens after running for a while, with a lot of
activity, so enabling kernel debugging seems rather futile. Normally,
however, it doesn't fail so horribly.]
Suggestions, anyone?
-- Raul