Re: lock-up with module: Optimize __module_address() using a latched RB-tree

From: Arthur Marsh
Date: Mon Jul 06 2015 - 06:11:48 EST




Peter Zijlstra wrote on 06/07/15 19:34:
On Mon, Jul 06, 2015 at 04:03:45AM +0930, Arthur Marsh wrote:
On this machine, a single-core Athlon 64 running a 32-bit kernel built from
Linus' current git head, I get a lock-up early in the boot process. (A dmesg
output of a successful boot-up of kernel 4.1.0, up to and slightly past the
point where the git head kernel locks up, is attached.)

A photo of the lock-up appears at:

http://www.users.on.net/~arthur.marsh/20150706462.jpg

So, building a kernel with your .config (and a similar GCC), I was able to
match the Code: line from the oops against the actual compiler output.

The faulting instruction is a dereference of mod->module_core through:

  __module_address()
    mod_find()
      latch_tree_find()
        mod_tree_comp()
          __mod_tree_val()

    20e0:   8b 90 44 01 00 00    mov    0x144(%eax),%edx

Now eax is NULL, which will have given you the splat.

The curious thing is that the mod pointer is obtained from the
mod_tree_node structure:

    20d0:   8b 46 fc             mov    -0x4(%esi),%eax

And esi looks like a regular kernel pointer.

Furthermore, we explicitly set the mod pointers in mod_tree_insert()
before linking in the nodes.
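
As a rough illustration of the two loads above, here is a minimal user-space
C sketch. The struct layouts are simplified stand-ins (assumptions, not the
kernel's actual definitions), and tree_val() is a hypothetical name modelled
on __mod_tree_val(); real field offsets such as 0x144 depend on the kernel
version and .config. It shows why the mod back-pointer sits one pointer width
below the tree node address, which on a 32-bit build would match the
-0x4(%esi) load.

```c
#include <stddef.h>

/* Hypothetical stand-in for the tree node embedded in each module entry. */
struct tree_node {
    struct tree_node *left, *right;
};

struct module;

/* Assumed layout: the back-pointer immediately precedes the tree node. */
struct mod_tree_node {
    struct module *mod;      /* expected to be set before the node is linked in */
    struct tree_node node;   /* the pointer the tree walk hands to the comparator */
};

/* Simplified module; only the fields the sketch needs. */
struct module {
    char name[56];
    void *module_core;
    struct mod_tree_node mtn_core;
};

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Recover the owning module from a tree node and return its module_core.
 * The NULL check exists only so the sketch can run safely; the fault
 * described above implies the kernel path dereferences mod directly.
 */
void *tree_val(struct tree_node *n)
{
    struct mod_tree_node *mtn = container_of(n, struct mod_tree_node, node);
    struct module *mod = mtn->mod;   /* corresponds to: mov -0x4(%esi),%eax */

    if (!mod)
        return NULL;                 /* the kernel would fault at this point */

    return mod->module_core;         /* corresponds to: mov 0x144(%eax),%edx */
}
```

With a valid node pointer but a NULL mod back-pointer, tree_val() takes the
NULL branch. That is the state the register dump suggests: esi plausible,
eax zero.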

This all has the smell of memory corruption. About the gfx card you need to
pull: you're not accidentally using a binary driver for that?


No, it's the standard kernel radeon driver.

If it's of any use, I can reboot the machine with the kernel that goes boom and the video card removed.

Arthur.