Re: lock-up with module: Optimize __module_address() using a latched RB-tree

From: Peter Zijlstra
Date: Mon Jul 06 2015 - 06:04:59 EST


On Mon, Jul 06, 2015 at 04:03:45AM +0930, Arthur Marsh wrote:
> On this machine, a single core Athlon 64 with a 32 bit current Linus' git
> head kernel, I get a lock-up early in the boot process. (A dmesg output of a
> successful boot-up of kernel 4.1.0 up to and slightly past the point where
> the git head kernel locks up is attached).
>
> A photo of the lock-up appears at:
>
> http://www.users.on.net/~arthur.marsh/20150706462.jpg

So building a kernel with your .config (and a similar GCC) I was able to
match up the Code: with actual compiler output.

The faulting instruction is a dereference of mod->module_core through:

  __module_address()
    mod_find()
      latch_tree_find()
        mod_tree_comp()
          __mod_tree_val()

    20e0:  8b 90 44 01 00 00    mov    0x144(%eax),%edx

Now eax is NULL, which will have given you the splat.

The curious thing is that the mod pointer is obtained from the
mod_tree_node structure:

    20d0:  8b 46 fc             mov    -0x4(%esi),%eax

And esi looks like a regular kernel pointer.
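
To illustrate where that -0x4 comes from, here is a minimal user-space
sketch. It assumes simplified stand-in types: the real struct module and
struct latch_tree_node are far larger, and the placeholder field and the
*_sketch/offset helper names are hypothetical. The tree hands back a
pointer to the embedded node, the mod pointer sits immediately before it,
so on 32-bit the container_of() arithmetic folds into a load from
-0x4(%esi).

```c
#include <stddef.h>

/* Simplified stand-ins; the real kernel structures carry many more fields. */
struct module {
	void *module_core;
};

struct latch_tree_node {
	int placeholder;	/* hypothetical; the real node holds the tree linkage */
};

struct mod_tree_node {
	struct module *mod;		/* back-pointer to the owning module */
	struct latch_tree_node node;	/* what the tree lookup hands back */
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Byte distance from the embedded node back to the mod pointer. */
static long mod_offset_from_node(void)
{
	return (long)offsetof(struct mod_tree_node, mod) -
	       (long)offsetof(struct mod_tree_node, node);
}

/* Models the lookup path: node -> containing mod_tree_node -> mod -> module_core. */
static unsigned long tree_val_sketch(struct latch_tree_node *n)
{
	struct mod_tree_node *mtn = container_of(n, struct mod_tree_node, node);
	struct module *mod = mtn->mod;

	if (!mod)	/* added for the sketch only; the faulting code has no such check */
		return 0;

	return (unsigned long)mod->module_core;
}
```

With 4-byte pointers mod_offset_from_node() comes out as -4, matching the
displacement in the disassembly. A NULL mtn->mod is the case that faults
in the real code path.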

Furthermore, we explicitly set the mod pointers in mod_tree_insert()
before linking in the nodes.
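
A rough sketch of that ordering, again in user space and under stated
assumptions: the singly linked list is a hypothetical stand-in for the
latched RB-tree, the *_sketch and checker names are made up, and the
field names mtn_core, mtn_init and init_size are from my recollection of
the source rather than anything quoted in this thread. The point is only
that the back-pointers are written before a node becomes reachable.

```c
#include <stddef.h>

struct module;

struct mod_tree_node {
	struct module *mod;
	struct mod_tree_node *next;	/* hypothetical stand-in for the tree linkage */
};

struct module {
	struct mod_tree_node mtn_core;
	struct mod_tree_node mtn_init;
	unsigned int init_size;
};

static struct mod_tree_node *tree_head;	/* hypothetical stand-in for the tree root */

/* Make a node reachable by lookups. */
static void tree_link_sketch(struct mod_tree_node *node)
{
	node->next = tree_head;
	tree_head = node;
}

static void mod_tree_insert_sketch(struct module *mod)
{
	/* Set the back-pointers first ... */
	mod->mtn_core.mod = mod;
	mod->mtn_init.mod = mod;

	/* ... and only then link the nodes in. */
	tree_link_sketch(&mod->mtn_core);
	if (mod->init_size)
		tree_link_sketch(&mod->mtn_init);
}

/* Returns 1 if every linked node carries a non-NULL mod pointer. */
static int all_linked_nodes_have_mod(void)
{
	struct mod_tree_node *n;

	for (n = tree_head; n; n = n->next)
		if (!n->mod)
			return 0;
	return 1;
}
```

With this ordering a lookup should never see a linked node whose mod
pointer is still NULL, which is why a NULL there points at something
else scribbling over the memory.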

This all has the smell of memory corruption. That gfx card you need to
pull: you're not accidentally using a binary driver for it, are you?