Re: 2.6.21-rc7-mm2 -- x86_64 blade hard hangs

From: Mel Gorman
Date: Thu Apr 26 2007 - 10:46:40 EST


On (26/04/07 14:33), Mel Gorman didst pronounce:
> On (26/04/07 15:51), Andi Kleen didst pronounce:
> > Andy Whitcroft <apw@xxxxxxxxxxxx> writes:
> >
> > > Getting hard boot failures before any kernel output on an x86_64 blade:
> > >
> > > root (hd0,0)
> > > Filesystem type is ext2fs, partition type 0x83
> > > kernel /vmlinuz-autobench ro root=/dev/VolGroup00/LogVol00 rhgb
> > > console=tty0 console=ttyS1,19200 selinux=no autobench_args:
> > > root=30726124 ABAT:1177507960 profile=2
> > > [Linux-bzImage, setup=0x1e00, size=0x21fe48]
> > > initrd /initrd-autobench.img
> > > [Linux-initrd @ 0x37e5f000, 0x190984 bytes]
> > > [silence]
> > >
> > > Kernel config is here:
> > >
> > > http://test.kernel.org/abat/85082/build/dotconfig
> >
> > You should boot with earlyprintk=serial,ttyS1,19200
> >
>
> Here is the console log. Haven't taken a closer look yet in case it's
> blindingly obvious to you.
>
> kernel /vmlinuz-autobench ro root=/dev/VolGroup00/LogVol00 rhgb
> console=tty0 co
> nsole=ttyS1,19200 selinux=no autobench_args: root=30726124
> ABAT:1177594149 logl
> evel=8 earlyprintk=serial,ttyS1,19200
> [Linux-bzImage, setup=0x1e00, size=0x220da8]
> initrd /initrd-autobench.img
> [Linux-initrd @ 0x37e5f000, 0x190982 bytes]
> Linux version 2.6.21-rc7-mm2-autokern1 (root@xxxxxxxxxxxxxxxxxxxxxxxxx)
> (gcc version 4.1.1 20060525 (Red Hat 4.1.1-1)) #1 SMP Thu Apr 26
> 08:16:37 CDT 2007
> Command line: ro root=/dev/VolGroup00/LogVol00 rhgb console=tty0
> console=ttyS1,19200 selinux=no autobench_args: root=30726124
> ABAT:1177594149 loglevel=8 earlyprintk=serial,ttyS1,19200
> BIOS-provided physical RAM map:
> BIOS-e820: 0000000000000000 - 000000000009d400 (usable)
> BIOS-e820: 000000000009d400 - 00000000000a0000 (reserved)
> BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
> BIOS-e820: 0000000000100000 - 000000003ffcddc0 (usable)
> BIOS-e820: 000000003ffcddc0 - 000000003ffd0000 (ACPI data)
> BIOS-e820: 000000003ffd0000 - 0000000040000000 (reserved)
> BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
> end_pfn_map = 1048576
> kernel direct mapping tables up to 100000000 @ 8000-d000
> DMI 2.3 present.
> ACPI: RSDP 000FDFC0, 0014 (r0 IBM )
> ACPI: RSDT 3FFCFF80, 0034 (r1 IBM SERBLADE 1000 IBM 45444F43)
> ACPI: FACP 3FFCFEC0, 0084 (r2 IBM SERBLADE 1000 IBM 45444F43)
> ACPI: DSDT 3FFCDDC0, 1EA6 (r1 IBM SERBLADE 1000 INTL 2002025)
> ACPI: FACS 3FFCFCC0, 0040
> ACPI: APIC 3FFCFE00, 009C (r1 IBM SERBLADE 1000 IBM 45444F43)
> ACPI: SRAT 3FFCFD40, 0098 (r1 IBM SERBLADE 1000 IBM 45444F43)
> ACPI: HPET 3FFCFD00, 0038 (r1 IBM SERBLADE 1000 IBM 45444F43)
> SRAT: PXM 0 -> APIC 0 -> Node 0
> SRAT: PXM 0 -> APIC 1 -> Node 0
> SRAT: PXM 1 -> APIC 2 -> Node 1
> SRAT: PXM 1 -> APIC 3 -> Node 1
> SRAT: Node 0 PXM 0 0-40000000
> PANIC: early exception rip ffffffff808a7e83 error 0 cr2 5cf0
> Call Trace:
> [<ffffffff808a7e83>] reserve_bootmem_node+0x0/0xc
> [<ffffffff808a3a8c>] reserve_bootmem_generic+0x6d/0x99
> [<ffffffff8089cc00>] setup_arch+0x2f7/0x469
> [<ffffffff80896510>] start_kernel+0x64/0x2aa
> [<ffffffff80896148>] _sinittext+0x148/0x14c
> RIP reserve_bootmem_node+0x0/0xc
>

When I looked closer, nothing seemed to make sense so here is another
log with stack unwinder, frame pointer etc etc

PANIC: early exception rip ffffffff8095a873 error 0 cr2 5cf0

Call Trace:
[<ffffffff8020b174>] dump_trace+0xb6/0x3d8
[<ffffffff8020b4d2>] show_trace+0x3c/0x52
[<ffffffff8020b4fd>] dump_stack+0x15/0x17
[<ffffffff802001e7>]
DWARF2 unwinder stuck at 0xffffffff802001e7
Leftover inexact backtrace:
[<ffffffff8095a873>] reserve_bootmem_node+0x1/0x12
[<ffffffff8096114f>] acpi_table_parse+0x4f/0x69
[<ffffffff80955f91>] reserve_bootmem_generic+0x73/0x9e
[<ffffffff8094edd3>] setup_arch+0x301/0x49c
[<ffffffff8094855f>] start_kernel+0x6c/0x30b
[<ffffffff80948144>] _sinittext+0x144/0x14b

RIP reserve_bootmem_node+0x1/0x12

Will keep kicking.

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/