Re: [PATCH 03/34] mm, vmscan: move LRU lists to node

From: Mel Gorman
Date: Fri Aug 05 2016 - 04:42:20 EST


On Thu, Aug 04, 2016 at 09:59:17PM +0100, James Hogan wrote:
> > Signed-off-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
> > Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>
> > Acked-by: Vlastimil Babka <vbabka@xxxxxxx>
>
> This breaks boot on metag architecture:
> Oops: err 0007 (Data access general read/write fault) addr 00233008 [#1]
>
> It appears to be in node_page_state_snapshot() (via
> pgdat_reclaimable()), and have come via mm_init. Here's the relevant
> bit of the backtrace:
>
> node_page_state_snapshot@0x4009c884(enum node_stat_item item =
> ???, struct pglist_data * pgdat = ???) + 0x48
> pgdat_reclaimable(struct pglist_data * pgdat = 0x402517a0)
> show_free_areas(unsigned int filter = 0) + 0x2cc
> show_mem(unsigned int filter = 0) + 0x18
> mm_init@0x4025c3d4()
> start_kernel() + 0x204
>
> __per_cpu_offset[0] == 0x233000 (close to bad addr),
> pgdat->per_cpu_nodestats = NULL, and setup_per_cpu_pageset()
> definitely hasn't been called yet (mm_init is called before
> setup_per_cpu_pageset()).
>
> Any ideas what the correct solution is (and why presumably others
> haven't seen the same issue on other architectures?).
>

metag calls show_mem() in mem_init() before the pagesets are initialised.
What's surprising is that it worked with the zone stats, as calling
zone_reclaimable() from that context looks like it should have broken
too. Did anything change recently that would previously have avoided the
zone->pageset dereference in zone_reclaimable()?
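
The faulting address in the report is consistent with a per-cpu access
through a NULL base: NULL plus __per_cpu_offset[0] plus a small field
offset. A rough userspace sketch of that arithmetic follows. The struct
layout, the index 7 and the helper name are assumptions for illustration
only, not the real kernel definitions:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-cpu node stats; NOT the kernel layout. */
struct fake_nodestat {
	signed char stat_threshold;
	signed char diff[16];
};

/*
 * Models a per-cpu dereference: the (possibly NULL) per-cpu base pointer
 * is shifted by this CPU's offset, then the field offset is added.
 */
static uintptr_t percpu_addr(uintptr_t base, uintptr_t cpu_offset,
			     size_t field_off)
{
	return base + cpu_offset + field_off;
}

/* Address touched when the base is NULL and CPU 0's offset is 0x233000. */
static uintptr_t reported_case(void)
{
	/* index 7 is an assumed stat item, chosen only for illustration */
	size_t off = offsetof(struct fake_nodestat, diff[7]);

	return percpu_addr(0, 0x233000, off);
}
```

With those assumed values reported_case() evaluates to 0x233008, i.e. an
address just past __per_cpu_offset[0], which is the pattern seen in the
oops.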

The easiest option would be to not call show_mem() from arch code until
after the pagesets are set up.
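
As a toy model of that ordering constraint (userspace only, all names
below are hypothetical and not kernel APIs): a snapshot needs the per-cpu
deltas, so any reporting that takes one has to run after the setup step.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; none of these are real kernel structures. */
struct model_node {
	long global_count;		/* valid from the start */
	signed char *percpu_diff;	/* NULL until model_setup_pagesets() */
	size_t nr_cpus;
};

static signed char model_diffs[4];

/* Stands in for the step that allocates the per-cpu stats. */
static void model_setup_pagesets(struct model_node *n)
{
	n->percpu_diff = model_diffs;
	n->nr_cpus = 4;
}

/*
 * Snapshot = global counter plus per-cpu deltas. Returns false when
 * called before setup, which is the case the caller must avoid.
 */
static bool model_snapshot(const struct model_node *n, long *out)
{
	long sum = n->global_count;

	if (!n->percpu_diff)
		return false;	/* too early: defer the report */
	for (size_t i = 0; i < n->nr_cpus; i++)
		sum += n->percpu_diff[i];
	*out = sum;
	return true;
}
```

The explicit NULL check exists only to make the ordering visible in the
model; the suggestion above is to move the call, not to add a check.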

--
Mel Gorman
SUSE Labs