Re: Dual opteron various segfaults with 2.6.14.2 and earlier kernels

From: Andrea Arcangeli
Date: Wed Nov 23 2005 - 19:03:34 EST


Hello Fabio,

On Thu, Nov 24, 2005 at 12:26:41AM +0100, Fabio Coatti wrote:
> yes, uname says 2.6.14.2; on a second identical machine, I've just seen this:
>
>
> factorial[2352]: segfault at 0000000000020f31 rip 00000000004035ae rsp
> 00007fffffbfaf60 error 4
> factorial[2354]: segfault at 0000000000020f31 rip 00000000004035ae rsp
> 00007fffffe3fc70 error 4
> factorial[2361]: segfault at 0000000000020f31 rip 00000000004035ae rsp
> 00007fffffb07c50 error 4
> factorial[2358]: segfault at 0000000000020f31 rip 00000000004035ae rsp
> 00007fffffb07c50 error 4
> factorial[2363]: segfault at 0000000000020f31 rip 00000000004035ae rsp
> 00007fffffe6d270 error 4
>
> the kernel and HW are the same.

Error 4 means a user-mode read of an address that is not mapped.
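For reference, the error value is the x86 page-fault error code, and each low bit has a fixed meaning. Below is a minimal sketch of a decoder; the function name and the dictionary keys are my own and purely illustrative, only the bit layout comes from the architecture.

```python
def decode_fault_error(code):
    """Decode the low bits of an x86 page-fault error code."""
    return {
        # bit 0: 0 = page not present, 1 = protection violation
        "protection_violation": bool(code & 0x1),
        # bit 1: 0 = read access, 1 = write access
        "write": bool(code & 0x2),
        # bit 2: 0 = fault in kernel mode, 1 = fault in user mode
        "user_mode": bool(code & 0x4),
        # bit 3: a reserved bit was set in a page-table entry
        "reserved_bit": bool(code & 0x8),
        # bit 4: the fault was caused by an instruction fetch
        "instruction_fetch": bool(code & 0x10),
    }


info = decode_fault_error(4)
print(info)
```

With code 4 only the user-mode bit is set, so the access was a read from userland and the page was not present.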

The above isn't necessarily a kernel or hardware problem; it looks like
a userland bug if it segfaults at such a low address (0x20f31). Nothing is
mapped below 0x400000, precisely so that these kinds of bugs get caught.
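To make that check less manual, here is a rough sketch that parses segfault lines like the ones you posted and flags fault addresses below 0x400000. The regular expression is my assumption about the log format, based only on the lines quoted above (rejoined onto one line each), and the helper name is hypothetical.

```python
import re

# Two of the quoted log lines, rejoined after mail wrapping.
LOG = """\
factorial[2352]: segfault at 0000000000020f31 rip 00000000004035ae rsp 00007fffffbfaf60 error 4
factorial[2354]: segfault at 0000000000020f31 rip 00000000004035ae rsp 00007fffffe3fc70 error 4
"""

# Assumed line format, inferred from the quoted messages only.
PATTERN = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"rip (?P<rip>[0-9a-f]+) rsp (?P<rsp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)

LOW_LIMIT = 0x400000  # nothing is expected to be mapped below this


def parse_segfaults(text):
    """Return one dict per segfault line, with the hex fields as integers."""
    faults = []
    for m in PATTERN.finditer(text):
        faults.append({
            "comm": m.group("comm"),
            "pid": int(m.group("pid")),
            "addr": int(m.group("addr"), 16),
            "rip": int(m.group("rip"), 16),
            "rsp": int(m.group("rsp"), 16),
            "error": int(m.group("err"), 16),
        })
    return faults


for fault in parse_segfaults(LOG):
    low = fault["addr"] < LOW_LIMIT
    print(f'{fault["comm"]}[{fault["pid"]}] addr={fault["addr"]:#x} '
          f'rip={fault["rip"]:#x} low_address={low}')
```

Every quoted fault has the same fault address and the same rip, which is what you would expect from a deterministic bug in the program rather than from flaky hardware.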

You should debug the program and check what code is at address
0x4035ae. You can check it with gdb or objdump -d. Probably there's a
64-bit bug in the program that doesn't trigger on 32-bit x86 (or you may
not be noticing the segfault on 32-bit because it wouldn't be logged in
the syslog).
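If the binary is large, you can limit the disassembly to a small window around the faulting rip. The sketch below only builds the objdump command line; the path ./factorial and the window sizes are assumptions for illustration, so adjust them to your setup.

```python
def objdump_command(binary, fault_rip, before=0x20, after=0x10):
    """Build an objdump invocation that disassembles a window around fault_rip."""
    start = max(fault_rip - before, 0)
    stop = fault_rip + after
    return [
        "objdump", "-d",
        f"--start-address={start:#x}",
        f"--stop-address={stop:#x}",
        binary,  # hypothetical path to the crashing executable
    ]


cmd = objdump_command("./factorial", 0x4035ae)
print(" ".join(cmd))
```

Note that starting the disassembly at an arbitrary address can land in the middle of an instruction on x86, so the first few lines of output may be misdecoded. Disassembling the whole enclosing function is more reliable if the binary has symbols.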

Hope this helps ;)