Re: [BUG] 2.6.26-rc1 lost half the RAM on UltraSPARC 5

From: David Miller
Date: Mon May 12 2008 - 18:31:59 EST


From: Mikael Pettersson <mikpe@xxxxxxxx>
Date: Mon, 12 May 2008 21:06:41 +0200

> David Miller writes:
> > Please also add the debugging patch below.
>
> Right, 2.6.26-rc2 plus your debugging patch and booted with numa=debug
> prints the following:

Hmmm... my debugging patch had this:

diff --git a/lib/lmb.c b/lib/lmb.c
index 83287d3..3f55973 100644
--- a/lib/lmb.c
+++ b/lib/lmb.c
...
+#define DEBUG
+

which should have resulted in lmb_dump_all() printing some useful
debugging messages. I validated that it did so on my machine with the
patch applied, but they appear nowhere in your logs :(
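
For illustration, here is a minimal userspace sketch of how a DEBUG-gated
pr_debug() behaves. It is a stand-in written for this note, not the kernel's
actual definition: the pr_debug() body and the dump_region() helper are
hypothetical, and only the format string is taken from the patch further down.

```c
#include <stdarg.h>
#include <stdio.h>

#define DEBUG   /* defined before pr_debug() so the #ifdef below sees it */

/* Hypothetical stand-in for a DEBUG-gated pr_debug(); returns bytes printed. */
static int pr_debug(const char *fmt, ...)
{
#ifdef DEBUG
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vprintf(fmt, ap);   /* DEBUG set: the message is really emitted */
    va_end(ap);
    return n;
#else
    (void)fmt;              /* DEBUG unset: the call compiles to a no-op */
    return 0;
#endif
}

/* Hypothetical helper printing one region line in the lmb_dump_all() style. */
static int dump_region(unsigned long i, unsigned long long base)
{
    return pr_debug(" memory.region[0x%lx].base = 0x%llx\n", i, base);
}
```

In a scheme like this, removing the #define, or placing it after the gated
definition has already been processed, silently turns every pr_debug() call
into a no-op.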

paging_init() in arch/sparc64/mm/init.c calls lmb_analyze() then lmb_dump_all().

Those messages go out with KERN_DEBUG log level, maybe messages at
that level were trimmed by your log capture for some reason?
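
One way such trimming can happen is console log-level filtering: KERN_DEBUG
is severity 7, and a message is shown on the console only when its severity
is numerically lower than the console loglevel. The sketch below models that
rule in userspace. The enum and function names are hypothetical, and whether
this is what affected the log capture in this report is only a guess.

```c
#include <stdio.h>

/* Hypothetical names; the values match the "<0>".."<7>" printk prefixes. */
enum {
    MSG_EMERG   = 0,
    MSG_WARNING = 4,
    MSG_INFO    = 6,
    MSG_DEBUG   = 7   /* KERN_DEBUG */
};

/* Returns 1 if a message of msg_level would be shown on the console. */
static int reaches_console(int msg_level, int console_loglevel)
{
    return msg_level < console_loglevel;
}

/* Prints which severities pass for a given console loglevel. */
static void show_filter(int console_loglevel)
{
    int level;

    for (level = MSG_EMERG; level <= MSG_DEBUG; level++)
        printf("level %d: %s\n", level,
               reaches_console(level, console_loglevel) ? "shown" : "dropped");
}
```

With a console loglevel of 7, severity 7 messages are dropped from the
console, although they normally remain in the kernel log buffer that dmesg
reads.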

In any event I think I know the area that's causing some kind of problem.
It looks like lmb_alloc() has a case where it will reserve the wrong
amount of memory, or something like that.

You can remove the debugging patch I sent you, and try this one instead.
Please make sure KERN_DEBUG messages make it into the log :-)

diff --git a/arch/sparc64/mm/init.c b/arch/sparc64/mm/init.c
index a9828d7..a628a99 100644
--- a/arch/sparc64/mm/init.c
+++ b/arch/sparc64/mm/init.c
@@ -1353,6 +1353,8 @@ static void __init bootmem_init_one_node(int nid)

numadbg("bootmem_init_one_node(%d)\n", nid);

+ lmb_dump_all();
+
p = NODE_DATA(nid);

if (p->node_spanned_pages) {
diff --git a/lib/lmb.c b/lib/lmb.c
index 83287d3..d8c84f3 100644
--- a/lib/lmb.c
+++ b/lib/lmb.c
@@ -17,6 +17,8 @@

#define LMB_ALLOC_ANYWHERE 0

+#define DEBUG
+
struct lmb lmb;

void lmb_dump_all(void)
@@ -29,7 +31,7 @@ void lmb_dump_all(void)
pr_debug(" memory.size = 0x%llx\n",
(unsigned long long)lmb.memory.size);
for (i=0; i < lmb.memory.cnt ;i++) {
- pr_debug(" memory.region[0x%x].base = 0x%llx\n",
+ pr_debug(" memory.region[0x%lx].base = 0x%llx\n",
i, (unsigned long long)lmb.memory.region[i].base);
pr_debug(" .size = 0x%llx\n",
(unsigned long long)lmb.memory.region[i].size);
@@ -38,7 +40,7 @@ void lmb_dump_all(void)
pr_debug(" reserved.cnt = 0x%lx\n", lmb.reserved.cnt);
pr_debug(" reserved.size = 0x%lx\n", lmb.reserved.size);
for (i=0; i < lmb.reserved.cnt ;i++) {
- pr_debug(" reserved.region[0x%x].base = 0x%llx\n",
+ pr_debug(" reserved.region[0x%lx].base = 0x%llx\n",
i, (unsigned long long)lmb.reserved.region[i].base);
pr_debug(" .size = 0x%llx\n",
(unsigned long long)lmb.reserved.region[i].size);