Kernel lockups with 2.0.3x

Mike A. Harris (mharris@ican.net)
Sat, 25 Jul 1998 18:27:17 -0400 (EDT)


I am having aggravation extraordinaire from a lockup problem right
now, and have no idea where to even begin tracking the problem
down.

I am using kernel 2.0.35 right now, but the same trouble happens
whichever kernel I use, at least within 2.0.3x. It is a stock
2.0.35 kernel with SCSI emulation enabled so I can use my CD
writer. I also have IP masquerading compiled in, along with
everything it needs, plus PLIP. I've been using this setup for
quite some time now with no trouble.

I can leave my system up for days, even weeks on end with no
trouble. However, once I start up X, that is when the trouble
starts - well... maybe, maybe not. I have used X for 3 days
straight without a lockup or screwup. But now and then, I
get a number of freeze-ups. At first, they were fixable by
shutting down X and restarting; now my system goes totally dead.

The display shifts everything to the left or right and/or blacks
out part of the screen, then freezes solid. No response from the
keyboard, or anything else. A background file copy running from
mc on a console also stopped making noise, so I must assume that
the machine locked up totally. There is nothing written about the
incident in any of the files under /var/log.

Also, just prior to that, I was using "mc" on a console, and it
wasn't functioning properly. I was in one FTP site (site.a.com),
then disconnected from there and connected to site 2
(site.b.com). I went a few directories deep, and when I hit ".."
to go back a dir, it said at the bottom of the screen,
"connecting to site.a.com", "reading dir", and then displayed
site *2*'s directory! I don't know if this is an mc bug or a
kernel bug, but it was very strange. A few seconds later, mc
stopped working properly, and I had to kill it from another VC.
After I killed it, I could not start it up again. When I typed
mc, I would get a blank black screen with a spinner in the upper
right, then nothing, and I had to kill it again from another VC.
I then went into /tmp, deleted all the mc temp files, and tried
again - nothing. I figured that I'd try rebooting and restarting
"mc" (a la Windows 95), as that has worked before for some
strange reason, but first I'd check my telnet session in X and
log out. When I switched to X, that is when the screen twisted,
displayed a screwed up KDE desktop, and then locked hard.

I tried to telnet in from my 486 over PLIP, and couldn't connect.
No ping either.

I was very upset, as I lost a great deal of the work I had in
progress. No drive corruption luckily (at least not on ext2; I
haven't tested my FAT32 or FAT drives yet).

I realize that this could be a problem with X, the kernel, both,
or perhaps even the hardware. I'd like to do whatever I can to
help someone (Alan?) pinpoint what is causing this problem. If
someone could give me a troubleshooting checklist, or a list of
programs to run in sequence or parallel, or whatever, I'd be
more than happy to try to lock up my machine in a reproducible
way to pinpoint the trouble. It would be time well spent, that's
for sure.

I'm becoming *scared* of doing serious work in X now, or even of
having X running at all, because it might lock up Linux. That was
unheard of to me a year ago.

Any suggestions, comments, flames? Feel free.

Take care everyone, and keep up the great work! Long live Linux!

--
Mike A. Harris  -  Computer Consultant  -  Linux advocate

Escape from the confines of Microsoft's operating systems and push your PC to its limits with LINUX - a real OS. http://www.redhat.com
