Oops in get_boot_pages at reboot

From: Olof Johansson
Date: Thu Apr 01 2004 - 12:00:57 EST


Hi,

We've started to see Oopses when rebooting some ppc64 systems built with
CONFIG_NUMA after the distribute-early-allocations-across-nodes patch went
in. This happens on the way down at a reboot:

Please stand by while rebooting the system...
md: stopping all md devices.
md: md0 switched to read-only mode.
Oops: Exception in kernel mode, sig: 4 [#1]
SMP NR_CPUS=32 NUMA
NIP: C00000000046D424 XER: 0000000020000000 LR: C0000000000D36CC
REGS: c000000033f8f6b0 TRAP: 0700 Tainted: GF (2.6.5-rc2-ames)
MSR: 8000000000089032 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11
TASK: c0000000018ff240[1] 'init' THREAD: c000000033f8c000 CPU: 7
GPR00: C0000000000D36CC C000000033F8F930 C000000000605100 00000000000000D0
GPR04: 0000000000000000 C000000033F8FAC0 0000000000000000 0000000000000000
GPR08: C000000033F8FAC0 0000000000001000 C00000003374B680 C000000000603000
GPR12: C0000000078A3BB0 C00000000049E000 0000000000000000 0000000000000400
GPR16: 0000000000000000 0000000000000000 C00000002FEBEB28 C00000002FEBEB18
GPR20: C00000002FEBEB20 C00000002FEBEB08 C00000002FEBEB10 C00000002FEBEB18
GPR24: 000000000000000B 0000000000000000 0000000000000400 C00000003374B680
GPR28: C000000033F8FAC0 C000000033ABFE80 C000000000549008 0000000000000000
NIP [c00000000046d424] .get_boot_pages+0x0/0x1ac
LR [c0000000000d36cc] .__pollwait+0x5c/0x108
Call Trace:
[c0000000000c9984] .pipe_poll+0xc4/0xd0
[c0000000000d3bac] .do_select+0x334/0x400
[c000000000020828] .sys32_select+0x440/0x654
[c000000000020a50] .ppc32_select+0x14/0x28
[c000000000011bdc] .ret_from_syscall_1+0x0/0xa4

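So __pollwait() calls __get_free_page(), and since system_running is
back to 0 on the way down, __get_free_pages() takes the get_boot_pages()
path. get_boot_pages() is labeled __init, so its text was freed after
boot and the call jumps into whatever is in that memory now. That matches
the NIP at .get_boot_pages+0x0 in the trace above.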

How about checking mem_init_done instead of system_running? It avoids
the oops here, but there might be some good reason not to do it. I don't
claim to know the intricate details of the MM. :-)
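
To make the difference between the two checks concrete, here is a rough
userspace model (not kernel code). The flag names mirror the kernel
variables mentioned above; the helper functions, the init_text_freed flag
and the assumption that mem_init_done is set once and never cleared are
mine, for illustration only.

```c
#include <assert.h>

/*
 * Userspace model only, not kernel code. The flag names mirror the kernel
 * variables from this mail; every helper below is hypothetical.
 */
static int system_running;  /* 1 after boot, back to 0 once reboot starts */
static int mem_init_done;   /* assumed: set once at mem_init(), never cleared */
static int init_text_freed; /* 1 once __init text has been released */

enum alloc_path { PATH_BOOT, PATH_NORMAL };

/* Current check: use the boot allocator whenever !system_running. */
static enum alloc_path pick_old(void)
{
	return !system_running ? PATH_BOOT : PATH_NORMAL;
}

/* Proposed check: use the boot allocator only until mem_init_done is set. */
static enum alloc_path pick_new(void)
{
	return !mem_init_done ? PATH_BOOT : PATH_NORMAL;
}

/* Taking the boot path after __init text is gone is what oopses. */
static int is_unsafe(enum alloc_path p)
{
	return p == PATH_BOOT && init_text_freed;
}

static void enter_early_boot(void)
{
	system_running = 0;
	mem_init_done = 0;
	init_text_freed = 0;
}

static void enter_running(void)
{
	mem_init_done = 1;   /* page allocator is ready */
	init_text_freed = 1; /* __init functions are no longer callable */
	system_running = 1;
}

static void enter_reboot(void)
{
	assert(system_running); /* a reboot starts from a running system */
	system_running = 0;     /* cleared again on the way down */
}
```

Stepping through enter_early_boot(), enter_running() and enter_reboot()
in that order, is_unsafe(pick_old()) ends up true while
is_unsafe(pick_new()) stays false, because only system_running goes back
to 0 at reboot.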

I'm not sure yet why we don't see this on every kernel, every time. Most
likely it's a timing issue: it only triggers if init happens to be in
select() (which it seems to use to sleep) at that point in the shutdown.
Either way, testing system_running here still seems risky.


Thanks,

Olof

===== mm/page_alloc.c 1.190 vs edited =====
--- 1.190/mm/page_alloc.c Sat Mar 20 05:56:20 2004
+++ edited/mm/page_alloc.c Wed Mar 31 18:22:48 2004
@@ -732,9 +732,10 @@
 fastcall unsigned long __get_free_pages(unsigned int gfp_mask, unsigned int order)
 {
 	struct page * page;
-
 #ifdef CONFIG_NUMA
-	if (unlikely(!system_running))
+	extern int mem_init_done;
+
+	if (unlikely(!mem_init_done))
 		return get_boot_pages(gfp_mask, order);
 #endif
 	page = alloc_pages(gfp_mask, order);



