Re: [PATCH 1/3] powerpc/numa: Set numa_node for all possible cpus
From: Srikar Dronamraju
Date: Thu Mar 12 2020 - 09:20:11 EST
* Vlastimil Babka <vbabka@xxxxxxx> [2020-03-12 10:30:50]:
> On 3/12/20 9:23 AM, Sachin Sant wrote:
> >> On 12-Mar-2020, at 10:57 AM, Srikar Dronamraju <srikar@xxxxxxxxxxxxxxxxxx> wrote:
> >> * Michal Hocko <mhocko@xxxxxxxxxx> [2020-03-11 12:57:35]:
> >>> On Wed 11-03-20 16:32:35, Srikar Dronamraju wrote:
> >>>> To ensure a cpuless, memoryless dummy node is not online, powerpc need
> >>>> to make sure all possible but not present cpu_to_node are set to a
> >>>> proper node.
> >>>
> >>> Just curious, is this somehow related to
> >>> http://lkml.kernel.org/r/20200227182650.GG3771@xxxxxxxxxxxxxx?
> >>>
> >>
> >> The issue I am trying to fix is a known issue in Powerpc since many years.
> >> So this surely not a problem after a75056fc1e7c (mm/memcontrol.c: allocate
> >> shrinker_map on appropriate NUMA node").
> >>
> >> I tried v5.6-rc4 + a75056fc1e7c but didnt face any issues booting the
> >> kernel. Will work with Sachin/Abdul (reporters of the issue).
I had used v1 and not v2. So my mistake.
> > I applied this 3 patch series on top of March 11 next tree (commit d44a64766795 )
> > The kernel still fails to boot with same call trace.
>
While I am not an expert in the slub area, I looked at the patch
a75056fc1e7c and had some thoughts on why this could be causing this issue.
On the system where the crash happens, the possible number of nodes is much
greater than the number of onlined nodes. The pdgat or the NODE_DATA is only
available for onlined nodes.
With a75056fc1e7c memcg_alloc_shrinker_maps, we end up calling kzalloc_node
for all possible nodes and in ___slab_alloc we end up looking at the
node_present_pages which is NODE_DATA(nid)->node_present_pages.
i.e for a node whose pdgat struct is not allocated, we are trying to
dereference.
Also for a memoryless/cpuless node or possible but not present nodes,
node_to_mem_node(node) will still end up as node (atleast on powerpc).
I tried with this hunk below and it works.
But I am not sure if we need to check at other places were
node_present_pages is being called.
diff --git a/mm/slub.c b/mm/slub.c
index 626cbcbd977f..bddb93bed55e 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2571,9 +2571,13 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
if (unlikely(!node_match(page, node))) {
int searchnode = node;
- if (node != NUMA_NO_NODE && !node_present_pages(node))
- searchnode = node_to_mem_node(node);
-
+ if (node != NUMA_NO_NODE) {
+ if (!node_online(node) || !node_present_pages(node)) {
+ searchnode = node_to_mem_node(node);
+ if (!node_online(searchnode))
+ searchnode = first_online_node;
+ }
+ }
if (unlikely(!node_match(page, searchnode))) {
stat(s, ALLOC_NODE_MISMATCH);
deactivate_slab(s, page, c->freelist, c);
> >
>
--
Thanks and Regards
Srikar Dronamraju