Re: Kernel-oops...

Ingo Molnar (mingo@chiara.csoma.elte.hu)
Mon, 17 May 1999 13:50:43 +0200 (CEST)


On Sat, 15 May 1999, Arnim von Helmolt wrote:

> Unable to handle kernel paging request at virtual address 00020000
> EIP: 0010:[find_buffer+40/76]
> eax: 00020000

> Unable to handle kernel paging request at virtual address 00020000
> eax: 00020000

that piece of code expects to see a NULL pointer there, but for some
reason it got 0x00020000. (which is exactly one bit away from NULL)

i have often seen flaky RAM and/or overclocked/overheated CPUs triggering
the above message. find_buffer() does a rather unpredictable and sometimes
long scan through various kernel structures, kicking out cachelines - this
appears to trigger RAM/CPU bit errors frequently. It might still be a
kernel bug of some sort, but if this is the main type of oops then i
really cant see anything else corrupting exactly one bit in that field.
i'd double-check the RAM, CPU cooler and CPU itself, plus BIOS timings.
Alternatively DMA mis-writes could cause such symptoms as well, although
those tend to show up in more violent ways. 2.2 kernels hit the CPU harder
than 2.0 kernels (ie. they do more stuff in less time, thus putting more
load on the hardware).

-- mingo

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/