Re: Finding mysterious 2.0.33 crashes -- Please help

Patrick D. Wildi (pwildi@wildi.com)
Fri, 13 Feb 1998 18:19:18 -0800 (PST)


On Fri, 13 Feb 1998, G. Sumner Hayes wrote:

> Has anyone had mysterious hangups in 2.0.33 (total hang, no messages in
> logs, no oops) who is _not_ using an Adaptec SCSI driver? It could help
> narrow the range of code to be checked if that's a common factor in all
> these cases. Also, has anyone had this problem who is _not_ using an
> NE2000 ethernet card? Who is _not_ using a PS/2 mouse? Who is _not_
> using glibc?

To repeat my configuration:
Adaptec SCSI (aic7xxx driver, now using version 5.0.5)
3Com 3c905 card
libc 5.4.33
PS/2 mouse

> Let's try to figure out the commonalities of our systems. My first
> suspicion is the 2940 driver.

That one we have in common.
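
For anyone else comparing notes, the overlap can be checked mechanically. This is only a rough sketch in Python: the category labels ("scsi", "nic", "libc", "mouse") and the simplified component names are my own shorthand for what was posted in this thread, not anything the kernel reports, and two machines are far too few to prove a cause.

```python
# Configurations as reported in this thread. The category labels and the
# simplified component names are illustrative assumptions, not kernel output.
configs = {
    "Patrick": {"scsi": "aic7xxx", "nic": "3c905", "libc": "libc5", "mouse": "PS/2"},
    "Sumner": {"scsi": "aic7xxx", "nic": "NE2000", "libc": "glibc", "mouse": "PS/2"},
}


def common_components(configs):
    """Return the (category, value) pairs shared by every configuration."""
    item_sets = [set(cfg.items()) for cfg in configs.values()]
    return set.intersection(*item_sets)


shared = common_components(configs)
for category, value in sorted(shared):
    print(f"{category}: {value}")
```

Adding another affected machine to the dictionary narrows the shared set further, which is the point of asking who is _not_ using each component.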

> When this happens, there is no panic on the console -- I've tried not
> running X and turning off the screen blanker to make sure.

Same here.

> Further comments on this below...
>
> On Fri, Feb 13, 1998 at 10:06:29AM +0000, Klaus Lichtenwalder wrote:
> > On Mon, 12 Feb 1996, Jon Torrez wrote:
> >
> > > (in response to my previous note)
> > >
> > > Yes! (read on fool :) )
> > >
> > > Well, try this Patrick:
> > > 1. remove all the patches and run from a clean unpatched kernel;
> > > unless your system needs them.
> > >
> > > 2. do remote logging with syslogd, logging *.*; something might show up.
> > >
> > > 3. cross your fingers.
> > >
> >
> > Well, I have to chime in. I'm also administering a web server
> > remotely, 2.0.33 with Solar Designer's security patches, a bunch
> > of ISDN interfaces, ethernet to internet backbone, 24 ethernet
> > aliases. This machine tends to just lock up after 3 days of uptime. I've
> > just now updated it to Doug's latest Adaptec driver, in case it's a SCSI
> > timeout on the disk where /var is mounted.
> >
>
> I've also had problems with 2.0.33 locking up. It's frustrating that
> there hasn't been a response from the kernel people, but it's quite
> understandable that without anything to go on this could be nearly
> impossible to trace. 2.0.29 (and 1.2.13 and various 1.3.x kernels)
> stayed up fine for me. I do have a 2940U and an NE2000. The machine has
> been running Linux for over a year without glitches until this.
>
> There are no messages on console or in the logs when this happens; it's
> a complete system freeze and only the reset button will fix it. Vanilla
> 2.0.33. I had problems when using both gcc 2.7.2.3 and egcs-1.0.1. It
> tended to happen about once a week.
>
> Curiously, the problem seems to be in remission. My machine was up
> for 11 days without a glitch; I rebooted it after upgrading to Doug
> Ledford's most recent aic7xxx driver (5.0.5, I believe...) and it's
> been up another 4 days since then (making at least 2 weeks since I've
> had a problem). I'd like to help find the problem if it isn't in the
> aic7xxx driver or if it is still present in the newest driver, but I
> don't really have any idea how to find it without any oopses or log
> messages.
>
> PPro 180, 64MB EDO
> Adaptec 2940U (AIC 7881U on-board controller)
> ISA NE2000 clone (which has worked fine for a long time)
> glibc 2.0.5c (I've upgraded to 2.0.6 recently)
> PS/2 Mouse
>
>
> -Sumner
>
> --
> rage, rage against the dying of the light
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.rutgers.edu
>

---------------------------------------------------------
Patrick Wildi patrick@wildi.com http://www.wildi.com
