Re: x86-64 2.6.15-rc2-git5 fails to boot with 4GB memory

From: Matti Aarnio
Date: Tue Nov 29 2005 - 18:52:41 EST


On Tue, Nov 29, 2005 at 07:01:12AM -0700, Andi Kleen wrote:
> Matti Aarnio <matti.aarnio@xxxxxxxxxxx> writes:
>
> > With 2 GB in place, the kernel boots just fine, but with
> > 4 GB, it reports:
>
> Works for me on several machines.
>
> I even have a fix for the Asus wrong MCFG problem now that
> broke the IOMMU on these boards (workaround is pci=nommconf)
>
> >
> > kernel direct mapping tables upto ffff 8101 5000 000 @ 8000-f000
> > PANIC: early exception rip ffff ffff 8016 f002 error 0 cr2 4230
> > PANIC: early exception rip ffff ffff 8011 d1fe error 0 cr2 ffff ffff f5ff d023
> >
> > and some other lines, which I didn't jot down on paper...
>
> Can you please look up the RIP values in your System.map?
>
> > These were copied from some Fedora Core development kernel version
> > after 2.6.15-rc1 (last working one) in a box with 4 GB memory.
>
> Please try vanilla 2.6.15rc2 as a reference at least.

Tried. Crashes with 4 GB memory present in the box.
Boots and runs nicely with only 2 GB of memory populated.

After adding -g to the *CFLAGS in the top-level Makefile, I tried
to determine WHERE those PANICs happened in rc2:

(gdb) list *0xffffffff80163a43
0xffffffff80163a43 is in memmap_init_zone (mm/page_alloc.c:1687).
1682            for (pfn = start_pfn; pfn < end_pfn; pfn++, page++) {
1683                    if (!early_pfn_valid(pfn))
1684                            continue;
1685                    if (!early_pfn_in_nid(pfn, nid))
1686                            continue;
1687                    page = pfn_to_page(pfn);
1688                    set_page_links(page, zone, nid, pfn);
1689                    set_page_count(page, 1);
1690                    reset_page_mapcount(page);
1691                    SetPageReserved(page);

(gdb) list *0xffffffff801196fa
0xffffffff801196fa is in safe_smp_processor_id (include/asm/smp.h:77).
72      #define raw_smp_processor_id() read_pda(cpunumber)
73
74      static inline int hard_smp_processor_id(void)
75      {
76              /* we don't want to mark this access volatile - bad code generation */
77              return GET_APIC_ID(*(unsigned int *)(APIC_BASE+APIC_ID));
78      }
79
80      extern int safe_smp_processor_id(void);
81      extern int __cpu_disable(void);


Not that those explain all that much...
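
For a kernel built without -g, the System.map lookup requested above
can be approximated with a small script. This is only a rough sketch,
assuming the usual "address type name" layout of System.map; the
function names (parse_system_map, resolve, parse_jotted) and all
symbol addresses in SAMPLE_MAP are made up for illustration and do not
come from a real build.

```python
import bisect

# Hypothetical excerpt: these addresses are invented for the example,
# not copied from an actual 2.6.15-rc2 System.map.
SAMPLE_MAP = """
ffffffff80119600 T safe_smp_processor_id
ffffffff80163900 T memmap_init_zone
ffffffff80163b00 T some_later_function
"""


def parse_system_map(lines):
    """Return a sorted list of (address, symbol) from System.map lines."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, rip):
    """Find the last symbol starting at or below rip.

    Returns (name, offset) or None if rip is below every symbol.
    """
    addrs = [addr for addr, _ in symbols]
    i = bisect.bisect_right(addrs, rip) - 1
    if i < 0:
        return None
    addr, name = symbols[i]
    return name, rip - addr


def parse_jotted(text):
    """Turn a hand-copied value like 'ffff ffff 8016 f002' into an int."""
    return int(text.replace(" ", ""), 16)


if __name__ == "__main__":
    syms = parse_system_map(SAMPLE_MAP.splitlines())
    hit = resolve(syms, 0xffffffff80163a43)
    if hit is not None:
        print("%s+0x%x" % hit)
```

Against a real build one would pass open("System.map") instead of the
sample text. The result is only the nearest preceding symbol, so it
gives a function name and offset, not a source line as gdb does.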


> -Andi

/Matti Aarnio