Re: [bisected] 051f3ca02e46 "Introduce NUMA identity node sched domain" breaks fake NUMA on s390

From: Heiko Carstens
Date: Mon May 14 2018 - 06:30:37 EST


On Mon, May 14, 2018 at 11:39:09AM +0200, Peter Zijlstra wrote:
> On Sat, May 12, 2018 at 12:02:33PM +0200, Heiko Carstens wrote:
> > Hello,
> >
> > Andre Wild reported that fake NUMA doesn't work on s390 anymore. "Doesn't
> > work" means it crashed for Andre, while for me it ends up in an endless loop
> > within init_sched_groups_capacity() (sg != sd->groups is always true).
> >
> > I could reproduce this with a very simple setup with only two nodes, where
> > each node has only one CPU. This allowed me to bisect it down to commit
> > 051f3ca02e46 ("sched/topology: Introduce NUMA identity node sched domain").
> >
> > With that commit reverted the system comes up again and the scheduling
> > domains look like this:
> >
> > [ 0.148592] smp: Bringing up secondary CPUs ...
> > [ 0.148984] smp: Brought up 2 nodes, 2 CPUs
> > [ 0.149097] CPU0 attaching sched-domain(s):
> > [ 0.149099] domain-0: span=0-1 level=NUMA
> > [ 0.149101] groups: 0:{ span=0 }, 1:{ span=1 }
> > [ 0.149106] CPU1 attaching sched-domain(s):
> > [ 0.149107] domain-0: span=0-1 level=NUMA
> > [ 0.149108] groups: 1:{ span=1 }, 0:{ span=0 }
> > [ 0.149111] span: 0-1 (max cpu_capacity = 1024)
> >
> > Any idea what's going wrong?
>
> Not yet; still trying to decipher your fake NUMA implementation.
>
> But meanwhile; could you provide me with:
>
> $ cat /sys/devices/system/node/node*/distance
> $ cat /sys/devices/system/node/node*/cpulist

Yes, of course:

$ cat /sys/devices/system/node/node0/distance
0 10
$ cat /sys/devices/system/node/node1/distance
10 0

$ cat /sys/devices/system/node/node0/cpulist
0
$ cat /sys/devices/system/node/node1/cpulist
1
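
In case it helps on setups with more nodes, here is a minimal sketch that
collects the same two sysfs files for every node into one summary. The
function names read_numa_topology() and format_topology() are made up for
this illustration, and it assumes the usual nodeN/distance and nodeN/cpulist
layout queried above; it only reads and reformats the values, nothing more.

```python
from pathlib import Path


def read_numa_topology(base="/sys/devices/system/node"):
    """Return {node_id: (distances, cpulist)} for each nodeN directory under base.

    Hypothetical helper: it assumes every nodeN directory contains a
    'distance' file (space-separated integers) and a 'cpulist' file.
    """
    topology = {}
    for node_dir in Path(base).glob("node[0-9]*"):
        node_id = int(node_dir.name[len("node"):])
        distances = [int(d) for d in (node_dir / "distance").read_text().split()]
        cpulist = (node_dir / "cpulist").read_text().strip()
        topology[node_id] = (distances, cpulist)
    return topology


def format_topology(topology):
    """Render one line per node, sorted by node id."""
    lines = []
    for node_id in sorted(topology):
        distances, cpulist = topology[node_id]
        row = " ".join(str(d) for d in distances)
        lines.append(f"node{node_id}: distance=[{row}] cpus={cpulist}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_topology(read_numa_topology()))
```

With the values shown above this would print "node0: distance=[0 10] cpus=0"
followed by "node1: distance=[10 0] cpus=1".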