Re: Weird problem with 2.4.4-pre6

From: Andrew Morton (andrewm@uow.edu.au)
Date: Wed Apr 25 2001 - 20:10:16 EST


Tobias Ringstrom wrote:
>
> Yesterday, I was running tcpdump, paging the output with less. All of a
> sudden, less started to dump core (SIGSEGV). I could not even start less
> by itself:
>
> > less
>
> without it getting a SIGSEGV, and in fact no user could run less without
> getting a SIGSEGV, but it did work perfectly a few minutes earlier. This
> morning, I tried to run less again, and now it was working! No core
> dumps!
>
> How can this happen? Something overwriting the page/buffer cache?

Yes. Something scribbled on the pagecache, most likely.

If this happens, take a copy of the offending binary and all its shared
libraries - simply copy them into a temp directory. The corrupted version
will be written to disk, from the pagecache. Make sure you keep
a copy of the offending vmlinux as well for looking things up in.

Then reboot and start diffing things; the differences can provide
clues. If the diffs show single-bit errors then it's a RAM
problem. If the diffs look like pointers into kernel space
then look 'em up in vmlinux and shout loudly. etc.

-
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Mon Apr 30 2001 - 21:00:14 EST