Re: 2.3.44 bug

From: Rik van Riel (riel@nl.linux.org)
Date: Sun Feb 13 2000 - 12:46:39 EST


On Sun, 13 Feb 2000, Tigran Aivazian wrote:

> I was using 2.3.43 (and all other recent 2.3.x) without observing
> this specific problem so it is something new to 2.3.44. I have not
> narrowed it down yet but the symptom is that I can't start any
> large programs (e.g. X) and can't compile almost anything, gcc
> randomly dies with SIGSEGV.

This problem is not new. 2.3.42 and 2.3.43 have it too.

> The machine is a plain dual PIII (Katmai) 550 on a GA BXD
> motherboard, nothing too exotic - good old Slot 1 design.

It's a typical SMP problem, supposedly caused by bugs in
the TLB flush code.

> I will let you know if I find more - I intend to narrow it down
> and then look closer at the changes.

I've narrowed it down a little bit.

$ for i in `seq 50` ; do md5sum <file_bigger_than_ram> ; done

Over NFS this works and gives the same (correct) MD5 sum every
time. When I copy the file to my ext2 /tmp partition and try
again, about one in five md5sums is incorrect ... but the file
_write_ has always been correct.
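
The repeated-read test above could be wrapped in a small shell
function that counts the bad runs instead of leaving them to be
spotted by eye. This is only a sketch: the name count_mismatches
and its two arguments are hypothetical, not part of the original
report, and it assumes the first run is the good one. If the first
read is itself corrupted, most of the later runs will be flagged.

```shell
#!/bin/sh
# Hypothetical helper: checksum FILE a total of RUNS times and print
# how many runs disagree with the first one.
count_mismatches() {
    file=$1
    runs=$2
    reference=
    bad=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Feed the file on stdin and keep only the hash field.
        sum=$(md5sum < "$file" | cut -d' ' -f1)
        if [ -z "$reference" ]; then
            # Assumption: the first run is taken as the reference.
            reference=$sum
        elif [ "$sum" != "$reference" ]; then
            bad=$((bad + 1))
            echo "run $i: $sum differs from $reference" >&2
        fi
        i=$((i + 1))
    done
    # The mismatch count goes to stdout, details go to stderr.
    echo "$bad"
}
```

Called as `count_mismatches /tmp/bigfile 50` (the path is a
placeholder), it would print the number of differing runs. The file
presumably needs to be larger than RAM, as in the test above, so
that each pass is read from disk again rather than served from the
page cache.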

2.3.37 is the last kernel I tried where the error didn't show
up, so I suspect it changed somewhere between 2.3.37 and 2.3.42.

regards,

Rik

--
The Internet is not a network of computers. It is a network
of people. That is its real strength.
