[PATCH 4/5] x86: fix system without memory on node0 -v2

From: Yinghai Lu
Date: Thu May 14 2009 - 12:43:53 EST



Jack found a crash on a system that doesn't have memory on node0.

It turns out that with the per_cpu changeset, node_number for the BSP will
always be 0, which is not consistent with cpu_to_node(), which already points
to the nearest node. This happens when numa_set_node() for node0 is called
early, before the per_cpu area is set up.

Two places touch per_cpu(node_number,):
1. cpu/common.c::cpu_init(), and it does not apply to the BP
| #ifdef CONFIG_NUMA
| if (cpu != 0 && percpu_read(node_number) == 0 &&
| cpu_to_node(cpu) != NUMA_NO_NODE)
| percpu_write(node_number, cpu_to_node(cpu));
| #endif
for BP: traps_init ==> cpu_init
for AP: start_secondary ==> cpu_init

2. cpu/intel.c or amd.c::srat_detect_node via numa_set_node()
for BP: check_bugs ==> identify_boot_cpu ==> identify_cpu()
that is rather late: numa_node_id() is already used for the BP before then...
for AP: start_secondary=>smp_callin=>smp_store_cpu_info()=>identify_secondary_cpu ==> identify_cpu()

So set it for the BP earlier, in setup_per_cpu_areas(), and don't bother
setting it for the APs there (it will be updated and used later).
Also leave the 0 untouched until the BP per_cpu data has been copied to the
APs, because cpu_init() relies on seeing 0 there for the APs.
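
To illustrate the ordering, here is a small userspace model of the boot
sequence described above. It is only a sketch: every name in it (model_*,
early_cpu_to_node, template_node_number, memoryless_node0_map, the 4-cpu
layout) is made up for illustration and is not the kernel's real data or API.
It only mirrors the cpu_init() condition quoted above and the assignment this
patch adds.

```c
#include <assert.h>

#define MODEL_NR_CPUS   4
#define MODEL_BOOT_CPU  0
#define MODEL_NO_NODE   (-1)    /* stands in for NUMA_NO_NODE */

/* Hypothetical layout: node 0 has no memory, so cpus 0-1 use node 1. */
static const int memoryless_node0_map[MODEL_NR_CPUS] = { 1, 1, 2, 2 };

static int early_cpu_to_node[MODEL_NR_CPUS];  /* filled before per_cpu setup */
static int template_node_number;              /* static per_cpu initial value */
static int node_number[MODEL_NR_CPUS];        /* per-cpu copies after setup */

static int model_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	return early_cpu_to_node[cpu];
}

/* Copy the boot cpu template to every cpu, then optionally apply the fix. */
static void model_setup_per_cpu_areas(int apply_fix)
{
	int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		node_number[cpu] = template_node_number;

	/* The fix: only the boot cpu is touched, and only after the copy. */
	if (apply_fix)
		node_number[MODEL_BOOT_CPU] = model_cpu_to_node(MODEL_BOOT_CPU);
}

/* Same condition as the cpu_init() snippet: cpu 0 is never updated here. */
static void model_cpu_init(int cpu)
{
	if (cpu != 0 && node_number[cpu] == 0 &&
	    model_cpu_to_node(cpu) != MODEL_NO_NODE)
		node_number[cpu] = model_cpu_to_node(cpu);
}

/* Run the modelled boot and return node_number as seen by the boot cpu. */
static int model_boot(int apply_fix, const int *map)
{
	int cpu;

	template_node_number = 0;
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		early_cpu_to_node[cpu] = map ? map[cpu] : MODEL_NO_NODE;

	model_setup_per_cpu_areas(apply_fix);
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		model_cpu_init(cpu);

	return node_number[MODEL_BOOT_CPU];
}
```

In this model, model_boot(0, memoryless_node0_map) leaves the boot cpu with
node_number 0 while model_cpu_to_node(0) says 1, which is the mismatch
described above; model_boot(1, memoryless_node0_map) makes the two agree. The
APs get the right value in both cases because they still see 0 when
model_cpu_init() runs.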

v2: updated changelog with detailed reason

[ Impact: fix crash on systems with a memoryless node 0 ]

Reported-and-tested-by: Jack Steiner <steiner@xxxxxxx>
Signed-off-by: Yinghai Lu <yinghai@xxxxxxxxxx>

---
arch/x86/kernel/setup_percpu.c | 8 ++++++++
1 file changed, 8 insertions(+)

Index: linux-2.6/arch/x86/kernel/setup_percpu.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/setup_percpu.c
+++ linux-2.6/arch/x86/kernel/setup_percpu.c
@@ -423,6 +423,14 @@ void __init setup_per_cpu_areas(void)
early_per_cpu_ptr(x86_cpu_to_node_map) = NULL;
#endif

+#if defined(CONFIG_X86_64) && defined(CONFIG_NUMA)
+ /*
+ * make sure boot cpu node_number is right, when boot cpu is on the
+ * node that doesn't have mem installed
+ */
+ per_cpu(node_number, boot_cpu_id) = cpu_to_node(boot_cpu_id);
+#endif
+
/* Setup node to cpumask map */
setup_node_to_cpumask_map();
