Re: v1.3.32 Oopses

really kuznet@ms2.inr.ac.ru (inr-linux-kernel@ms2.inr.ac.ru)
Wed, 11 Oct 1995 19:18:24 +0300 (MSK)


> Peter K: "v1.3.32 Oopses" (Oct 9, 15:56):
> > Hi, gang
> >
> > I have run into several (well, uh, about 10 in the space of about 6 hours)
> > kernel oopses just this morning after applying 1.3.32 complaining about
> > not being able to free unallocated kernel pages (?)
> >
> > My config : P5-133, 32 Mb, 256 Kb, 2 x 1 Gb SCSI II's, PCI NCR53c810, 8
> > bit SoundBlaster, ISA NE2000 (Accton), ISA USR 14400, PCI Phoenix S3
> > Trio64 running an ELF kernel.
> >
> > This happens regularly, so much so that I want to return to 1.3.31 (code
> > named 'OJ', innit ? :) Last time bleeding-linux was this unstable (for me)
> > was in the 1.1.5x's, I think.
>
> Could you please double-check that you didn't get any rejects or that
> the kernel compilation went ok? The 1.3.32 patches are pretty much
> totally insignificant, with the exception of the NFS code. So if 1.3.31
> was stable for you, 1.3.32 _should_ also be ok.
>
> Do you NFS-mount anything? If you don't, I'd really like you to try to
> do a "make clean" and re-make your kernel just to make sure it's not
> something like that. Because I can't see anything in the code that
> should make any difference..
>
> Linus
>

Linus!

You should not look for this bug in 1.3.32; it is present in all
versions >= 1.3.24 (maybe even earlier). It shows up randomly, depending
on the configured drivers(?), kernel memory layout, etc.

We have 4 different Linux boxes and have seen a similar oops in free_pages
with 1.3.24, 28, 30 and 31 (other versions were not run for long enough),
but always on only one box! The same kernels work on the other 3!

I suspected 'TCP cache mismatch' (it really does result in random memory
corruption). Alas, I have also seen this oops with no 'TCP ...' in syslog.

A.N.Kuznetsov