David,
On 19.09.16 11:49:30, David Daney wrote:
> Fix by supplying a cpu_to_node() implementation that returns correct
> node mappings.
> +int cpu_to_node(int cpu)
> +{
> +	int nid;
> +
> +	/*
> +	 * Return 0 for unknown mapping so that we report something
> +	 * sensible if firmware doesn't supply a proper mapping.
> +	 */
> +	if (cpu < 0 || cpu >= NR_CPUS)
> +		return 0;
> +
> +	nid = cpu_to_node_map[cpu];
> +	if (nid == NUMA_NO_NODE)
> +		nid = 0;
> +	return nid;
> +}
> +EXPORT_SYMBOL(cpu_to_node);
this implementation fixes the per-cpu workqueue initialization, but I
don't think a cpu_to_node() implementation private to arm64 is the
proper solution.
Besides the general preference for using the generic code,
cpu_to_node() is called in the kernel's fast path. I think your
implementation is too expensive there, and it does not use a per-cpu
data access for the lookup as the generic code does. Also, numa_off
is not considered at all.
Instead, we need to make sure the set_*numa_node() functions are
called earlier, before the secondary cpus are booted. My suggested
change for that is this:
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index d93d43352504..952365c2f100 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -204,7 +204,6 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
 static void smp_store_cpu_info(unsigned int cpuid)
 {
 	store_cpu_topology(cpuid);
-	numa_store_cpu_info(cpuid);
 }
 /*
@@ -719,6 +718,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
 			continue;
 		set_cpu_present(cpu, true);
+		numa_store_cpu_info(cpu);
 	}
 }
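To illustrate why the ordering matters, here is a rough user-space
model, not kernel code. All sketch_* names, the four-cpu table and
the node values are made up for illustration. It assumes the per-cpu
node value starts out as 0 and is only correct once the store
function has run for that cpu, and that a numa_off-style switch
forces node 0:

```c
#include <assert.h>

#define SKETCH_NR_CPUS 4

/* Hypothetical firmware-provided cpu -> node table. */
static const int sketch_fw_node_map[SKETCH_NR_CPUS] = { 0, 0, 1, 1 };

/*
 * Stand-in for the per-cpu node variable. It is zero-initialized, so
 * every cpu reports node 0 until its entry has been stored.
 */
static int sketch_percpu_node[SKETCH_NR_CPUS];

/* Stand-in for a numa_off switch: non-zero forces node 0 everywhere. */
static int sketch_numa_off;

/* Models storing the node of one cpu into its per-cpu slot. */
void sketch_store_cpu_info(unsigned int cpu)
{
	assert(cpu < SKETCH_NR_CPUS);
	sketch_percpu_node[cpu] = sketch_numa_off ? 0 : sketch_fw_node_map[cpu];
}

/* Models the cheap lookup: a plain read of the stored value. */
int sketch_cpu_to_node(unsigned int cpu)
{
	assert(cpu < SKETCH_NR_CPUS);
	return sketch_percpu_node[cpu];
}

/* Models the prepare step: store the node for all cpus up front. */
void sketch_prepare_cpus(void)
{
	unsigned int cpu;

	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		sketch_store_cpu_info(cpu);
}
```

In this model, any consumer that reads sketch_cpu_to_node() for cpu 2
or 3 before sketch_prepare_cpus() has run sees node 0 instead of
node 1, which is the kind of stale mapping an early user such as the
workqueue setup would pick up.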
I have tested the code and it properly sets up all per-cpu workqueues.
Unfortunately, neither your code nor mine fixes the BUG_ON() I see
with the numa kernel:
kernel BUG at mm/page_alloc.c:1848!
See below for the core dump. It looks like this happens when moving
a memory block whose first and last pages are mapped to different
numa nodes, which triggers the BUG_ON().
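The suspected condition can be sketched in user space as follows.
This is only a model under my assumptions: the sketch_* names, the
block size of 8 pages and the range table are hypothetical, and the
real check in mm/page_alloc.c may compare something other than the
node id directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical block size in pages; the real value is config dependent. */
#define SKETCH_PAGES_PER_BLOCK 8UL

/* Hypothetical description of one memory range and its owning node. */
struct sketch_range {
	unsigned long start_pfn;	/* first pfn in the range */
	unsigned long end_pfn;		/* one past the last pfn */
	int nid;			/* node the range belongs to */
};

/* Returns the node of a pfn, or -1 if no range covers it. */
int sketch_pfn_to_nid(const struct sketch_range *r, size_t n,
		      unsigned long pfn)
{
	size_t i;

	assert(r != NULL);
	for (i = 0; i < n; i++)
		if (pfn >= r[i].start_pfn && pfn < r[i].end_pfn)
			return r[i].nid;
	return -1;
}

/*
 * Returns true if the block containing pfn has its first and last
 * page on different nodes, the situation suspected to trip the check.
 */
bool sketch_block_spans_nodes(const struct sketch_range *r, size_t n,
			      unsigned long pfn)
{
	unsigned long first = pfn - (pfn % SKETCH_PAGES_PER_BLOCK);
	unsigned long last = first + SKETCH_PAGES_PER_BLOCK - 1;

	return sketch_pfn_to_nid(r, n, first) != sketch_pfn_to_nid(r, n, last);
}
```

With a node boundary that is not aligned to the block size (for
example node 0 ending at pfn 12 while blocks are 8 pages), the block
covering pfns 8..15 straddles both nodes and the function returns
true for it.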