Re: Regression on ARMs in next-20170531

From: Tony Lindgren
Date: Wed May 31 2017 - 14:03:35 EST


* Russell King - ARM Linux <linux@xxxxxxxxxxxxxxx> [170531 10:47]:
> I don't have a similarly configured kernel, but here I have for the
> start of this function:
>
> 00000680 <__mod_node_page_state>:
> 680: e1a0c00d mov ip, sp
> 684: e92dd870 push {r4, r5, r6, fp, ip, lr, pc}
> 688: e24cb004 sub fp, ip, #4
> 68c: e590cc00 ldr ip, [r0, #3072] ; 0xc00
> 690: e1a0400d mov r4, sp
> 694: ee1d6f90 mrc 15, 0, r6, cr13, cr0, {4}
> 698: e08c5001 add r5, ip, r1
> 69c: e2855001 add r5, r5, #1
> 6a0: e1a03005 mov r3, r5
> 6a4: e196c0dc ldrsb ip, [r6, ip]
> 6a8: e19630d3 ldrsb r3, [r6, r3]
>
> r5 in your code is the equivalent of r6, r4 => r3, r3 -> r5.
> lr is the __per_cpu_offset array, so the first instruction is
> trying to load the percpu offset.
>
> The faulting code is:
>
> x = delta + __this_cpu_read(*p);
>
> specifically "__this_cpu_read(*p)".
>
> "ip" holds "pcp" from:
>
> struct per_cpu_nodestat __percpu *pcp = pgdat->per_cpu_nodestats;
>
> and you may notice that it's zero in the register dump. So,
> pgdat->per_cpu_nodestats is NULL here.
>
> This seems to be set up in setup_per_cpu_pageset(), which in the init
> order, happens way after mm_init() (which contains kmem_cache_init()).

OK thanks, so that should help :)

> So, looks to me like an init ordering bug. I'm not sure why SMP
> would be working - maybe it's only working because it's managing to
> scribble over some memory that isn't faulting? I suspect a
> WARN_ON(!pcp) here will report even on SMP.

It's the other way around: with CONFIG_SMP=y it does not boot,
and with SMP disabled it boots.
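
For illustration only, here is a minimal userspace sketch of the
ordering problem described above. It is not the kernel code: the
structs are simplified stand-ins, and NR_ITEMS, setup_nodestats() and
mod_node_page_state_checked() are hypothetical names made up for this
example. The NULL check plays the role of the suggested WARN_ON(!pcp).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define NR_ITEMS 4	/* hypothetical stand-in for the real item count */

/* Simplified stand-in for the kernel's per-cpu node stat deltas. */
struct per_cpu_nodestat {
	signed char vm_node_stat_diff[NR_ITEMS];
};

/* Simplified stand-in for pgdat; the pointer is NULL until set up. */
struct pglist_data {
	struct per_cpu_nodestat *per_cpu_nodestats;
};

/*
 * Hypothetical helper modelling the allocation that is done late in
 * the init order (setup_per_cpu_pageset() in the discussion above).
 */
int setup_nodestats(struct pglist_data *pgdat)
{
	pgdat->per_cpu_nodestats = calloc(1, sizeof(*pgdat->per_cpu_nodestats));
	return pgdat->per_cpu_nodestats ? 0 : -1;
}

/*
 * Hypothetical checked variant of the stat update: report and bail
 * out instead of dereferencing a NULL pcp when called too early.
 * Returns 0 on success, -1 if the per-cpu area is not set up yet.
 */
int mod_node_page_state_checked(struct pglist_data *pgdat, int item, long delta)
{
	struct per_cpu_nodestat *pcp = pgdat->per_cpu_nodestats;
	long x;

	assert(item >= 0 && item < NR_ITEMS);

	if (!pcp) {
		fprintf(stderr, "per_cpu_nodestats used before setup\n");
		return -1;
	}

	/* Models "x = delta + __this_cpu_read(*p);" on a single CPU. */
	x = delta + pcp->vm_node_stat_diff[item];
	pcp->vm_node_stat_diff[item] = (signed char)x;
	return 0;
}
```

Calling mod_node_page_state_checked() on a zero-initialized
struct pglist_data returns -1, and the same call succeeds once
setup_nodestats() has run. That is the early-caller situation the
disassembly above points at.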

Regards,

Tony