[PATCH 3.18 10/50] Revert "powerpc/numa: Fix percpu allocations to be NUMA aware"

From: Greg Kroah-Hartman
Date: Fri Aug 04 2017 - 19:38:42 EST

3.18-stable review patch. If anyone has any objections, please let me know.


From: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

This reverts commit 138bb14846a5856747694ae9ef565c9eb4533a1e which is
commit ba4a648f12f4cd0a8003dd229b6ca8a53348ee4b upstream.

Michal Hocko writes:

JFYI. We have encountered a regression after applying this patch on a
large ppc machine. While the patch is the right thing to do it doesn't
work well with the current vmalloc area size on ppc and large machines
where NUMA nodes are very far from each other. Just for the reference
the boot fails on such a machine with bunch of warning preceeding it.
See http://lkml.kernel.org/r/20170724134240.GL25221@xxxxxxxxxxxxxx

It seems the right thing to do is to enlarge the vmalloc space on ppc
but this is not the case in the upstream kernel yet AFAIK. It is also
questionable whether that is a stable material but I will decision on
you here.

We have reverted this patch from our 4.4 based kernel.

Newer kernels do not have enlarged vmalloc space yet AFAIK so they won't
work properly eiter. This bug is quite rare though because you need a
specific HW configuration to trigger the issue - namely NUMA nodes have
to be far away from each other in the physical memory space.

Cc: Michal Hocko <mhocko@xxxxxxxxxx>
Cc: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
Cc: Nicholas Piggin <npiggin@xxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
arch/powerpc/include/asm/topology.h | 14 --------------
arch/powerpc/kernel/setup_64.c | 4 ++--
2 files changed, 2 insertions(+), 16 deletions(-)

--- a/arch/powerpc/include/asm/topology.h
+++ b/arch/powerpc/include/asm/topology.h
@@ -44,22 +44,8 @@ extern void __init dump_numa_cpu_topolog
extern int sysfs_add_device_to_node(struct device *dev, int nid);
extern void sysfs_remove_device_from_node(struct device *dev, int nid);

-static inline int early_cpu_to_node(int cpu)
- int nid;
- nid = numa_cpu_lookup_table[cpu];
- /*
- * Fall back to node 0 if nid is unset (it should be, except bugs).
- * This allows callers to safely do NODE_DATA(early_cpu_to_node(cpu)).
- */
- return (nid < 0) ? 0 : nid;

-static inline int early_cpu_to_node(int cpu) { return 0; }
static inline void dump_numa_cpu_topology(void) {}

static inline int sysfs_add_device_to_node(struct device *dev, int nid)
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -754,7 +754,7 @@ void ppc64_boot_msg(unsigned int src, co

static void * __init pcpu_fc_alloc(unsigned int cpu, size_t size, size_t align)
- return __alloc_bootmem_node(NODE_DATA(early_cpu_to_node(cpu)), size, align,
+ return __alloc_bootmem_node(NODE_DATA(cpu_to_node(cpu)), size, align,

@@ -765,7 +765,7 @@ static void __init pcpu_fc_free(void *pt

static int pcpu_cpu_distance(unsigned int from, unsigned int to)
- if (early_cpu_to_node(from) == early_cpu_to_node(to))
+ if (cpu_to_node(from) == cpu_to_node(to))