Re: [bisected] 2.6.31 regression: fails to boot as xen guest
From: Pekka Enberg
Date: Tue Aug 25 2009 - 12:29:35 EST
Hi Arnd,
On Tue, Aug 25, 2009 at 6:48 PM, Arnd Hannemann <hannemann@xxxxxxxxxxxxxxxxxxx> wrote:
> current 2.6.31 fails to boot on our xen host (32bit pae).
> Unfortunately it fails in a way that there is absolutely no output
> on the console. Config is as 32bit guest.
>
> Git bisect gave the following:
>
> 83b519e8b9572c319c8e0c615ee5dd7272856090 is first bad commit
> commit 83b519e8b9572c319c8e0c615ee5dd7272856090
> Author: Pekka Enberg <penberg@xxxxxxxxxxxxxx>
> Date: Wed Jun 10 19:40:04 2009 +0300
>
> slab: setup allocators earlier in the boot sequence
>
> This patch makes kmalloc() available earlier in the boot sequence so we can get
> rid of some bootmem allocations. The bulk of the changes are due to
> kmem_cache_init() being called with interrupts disabled which requires some
> changes to allocator boostrap code.
>
> Note: 32-bit x86 does WP protect test in mem_init() so we must setup traps
> before we call mem_init() during boot as reported by Ingo Molnar:
>
> We have a hard crash in the WP-protect code:
>
> [ 0.000000] Checking if this processor honours the WP bit even in supervisor mode...BUG: Int 14: CR2 ffcff000
> [ 0.000000] EDI 00000188 ESI 00000ac7 EBP c17eaf9c ESP c17eaf8c
> [ 0.000000] EBX 000014e0 EDX 0000000e ECX 01856067 EAX 00000001
> [ 0.000000] err 00000003 EIP c10135b1 CS 00000060 flg 00010002
> [ 0.000000] Stack: c17eafa8 c17fd410 c16747bc c17eafc4 c17fd7e5 000011fd f8616000 c18237cc
> [ 0.000000] 00099800 c17bb000 c17eafec c17f1668 000001c5 c17f1322 c166e039 c1822bf0
> [ 0.000000] c166e033 c153a014 c18237cc 00020800 c17eaff8 c17f106a 00020800 01ba5003
> [ 0.000000] Pid: 0, comm: swapper Not tainted 2.6.30-tip-02161-g7a74539-dirty #52203
> [ 0.000000] Call Trace:
> [ 0.000000] [<c15357c2>] ? printk+0x14/0x16
> [ 0.000000] [<c10135b1>] ? do_test_wp_bit+0x19/0x23
> [ 0.000000] [<c17fd410>] ? test_wp_bit+0x26/0x64
> [ 0.000000] [<c17fd7e5>] ? mem_init+0x1ba/0x1d8
> [ 0.000000] [<c17f1668>] ? start_kernel+0x164/0x2f7
> [ 0.000000] [<c17f1322>] ? unknown_bootoption+0x0/0x19c
> [ 0.000000] [<c17f106a>] ? __init_begin+0x6a/0x6f
>
> Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>
> Acked-by Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Cc: Christoph Lameter <cl@xxxxxxxxxxxxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxx>
> Cc: Matt Mackall <mpm@xxxxxxxxxxx>
> Cc: Nick Piggin <npiggin@xxxxxxx>
> Cc: Yinghai Lu <yinghai@xxxxxxxxxx>
> Signed-off-by: Pekka Enberg <penberg@xxxxxxxxxxxxxx>
>
> However a
> git revert 83b519e8b9572c319c8e0c615ee5dd7272856090
> to verify that current git without that commit would work,
> didn't succeed right away, so I was not able to test that.
Thanks for doing the bisect! Can we see your .config as well?

I doubt this is a slab allocator initialization issue, so I'm CC'ing
some Xen folks. Jeremy, I don't know Xen well, but on a quick read the
only thing I can see is that trap_init() is now called before
sched_init(). I see Xen doing preempt_enable()/preempt_disable(), so
maybe that's a problem now?
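
To make the ordering concern concrete, here is a toy user-space model of it.
This is only a sketch and not kernel code: every name in it (boot_model,
model_trap_init(), run_boot_order() and so on) is hypothetical, and it simply
assumes that the trap setup path brackets its work with a
preempt_disable()/preempt_enable() pair. It counts how many preemption
operations happen before the modelled sched_init() has run.

```c
/* Toy user-space model of boot-time init ordering. NOT kernel code;
 * all names here are hypothetical and exist only for illustration. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum boot_step { STEP_SCHED_INIT, STEP_TRAP_INIT };

struct boot_model {
    bool sched_ready;        /* true once the modelled sched_init() has run */
    int  preempt_count;      /* modelled preemption nesting counter */
    int  early_preempt_ops;  /* preempt operations seen before sched_ready */
};

static void model_preempt_disable(struct boot_model *m)
{
    if (!m->sched_ready)
        m->early_preempt_ops++;
    m->preempt_count++;
}

static void model_preempt_enable(struct boot_model *m)
{
    if (!m->sched_ready)
        m->early_preempt_ops++;
    m->preempt_count--;
}

static void model_sched_init(struct boot_model *m)
{
    m->sched_ready = true;
}

/* Assumption: the trap setup path wraps its work in a preempt pair. */
static void model_trap_init(struct boot_model *m)
{
    model_preempt_disable(m);
    /* ... trap table setup would happen here ... */
    model_preempt_enable(m);
}

/* Run the steps in the given order and return how many preemption
 * operations were issued before the modelled scheduler was ready. */
static int run_boot_order(const enum boot_step *steps, size_t n)
{
    struct boot_model m = { false, 0, 0 };

    for (size_t i = 0; i < n; i++) {
        if (steps[i] == STEP_SCHED_INIT)
            model_sched_init(&m);
        else
            model_trap_init(&m);
    }
    assert(m.preempt_count == 0);  /* disable/enable calls must balance */
    return m.early_preempt_ops;
}
```

Calling run_boot_order() with sched_init first reports zero early preemption
operations; with trap_init first it reports two. Whether such early calls
actually hurt a Xen guest is exactly the open question above.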
Pekka