Re: power9 NUMA crash while reading debugfs imc_cmd
From: Michael Ellerman
Date: Sat Jun 29 2019 - 07:29:20 EST
Qian Cai <cai@xxxxxx> writes:
> On Fri, 2019-06-28 at 17:19 +0530, Anju T Sudhakar wrote:
>> On 6/28/19 9:04 AM, Qian Cai wrote:
>> >
>> > > On Jun 27, 2019, at 11:12 PM, Michael Ellerman <mpe@xxxxxxxxxxxxxx> wrote:
>> > >
>> > > Qian Cai <cai@xxxxxx> writes:
>> > > > Reading the debugfs imc_cmd file for a memory-less node triggers the
>> > > > crash below on this power9 machine, which has the following NUMA
>> > > > layout.
>> > >
>> > > What type of machine is it?
>> >
>> > description: PowerNV
>> > product: 8335-GTH (ibm,witherspoon)
>> > vendor: IBM
>> > width: 64 bits
>> > capabilities: smp powernv opal
>>
>>
>> Hi Qian Cai,
>>
>> Could you please try with this patch:
>> https://lists.ozlabs.org/pipermail/linuxppc-dev/2019-June/192803.html
>>
>> and see if the issue is resolved?
>
> It works fine.
>
> It just feels a bit silly that a node without CPUs or memory is still online
> by default during boot in the first place on powerpc, but that is probably a
> different issue. For example,

Those are there to represent the memory on your attached GPUs. It's not
onlined by default.

I don't really love that they show up like that, but I think that's
working as expected.

cheers

> # numactl -H
> available: 6 nodes (0,8,252-255)
> node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
> 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52
> 53 54 55 56 57 58 59 60 61 62 63
> node 0 size: 126801 MB
> node 0 free: 123199 MB
> node 8 cpus: 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85
> 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108
> 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127
> node 8 size: 130811 MB
> node 8 free: 128436 MB
> node 252 cpus:
> node 252 size: 0 MB
> node 252 free: 0 MB
> node 253 cpus:
> node 253 size: 0 MB
> node 253 free: 0 MB
> node 254 cpus:
> node 254 size: 0 MB
> node 254 free: 0 MB
> node 255 cpus:
> node 255 size: 0 MB
> node 255 free: 0 MB
> node distances:
> node     0    8  252  253  254  255
>    0:   10   40   80   80   80   80
>    8:   40   10   80   80   80   80
>  252:   80   80   10   80   80   80
>  253:   80   80   80   10   80   80
>  254:   80   80   80   80   10   80
>  255:   80   80   80   80   80   10
>
> # cat /sys/devices/system/node/online
> 0,8,252-255
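
For anyone who wants to pick out such nodes without reading the numactl output by eye, here is a minimal sketch in Python. It only expands sysfs-style node lists like the "0,8,252-255" above and reports the online nodes that have neither CPUs nor memory. The function names parse_node_list and empty_online_nodes are made up for illustration, and the "0,8" strings in the example are an assumption about what the has_cpu and has_memory files would contain on this machine; they are not taken from the report.

```python
def parse_node_list(text):
    """Expand a sysfs-style list such as "0,8,252-255" into sorted ints."""
    nodes = set()
    for part in text.strip().split(","):
        if not part:
            # An empty file (or trailing comma) yields no nodes.
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            nodes.update(range(int(first), int(last) + 1))
        else:
            nodes.add(int(part))
    return sorted(nodes)


def empty_online_nodes(online, has_cpu, has_memory):
    """Return online nodes listed in neither has_cpu nor has_memory."""
    populated = set(parse_node_list(has_cpu)) | set(parse_node_list(has_memory))
    return [node for node in parse_node_list(online) if node not in populated]


# Example with the "online" value quoted above; the other two strings
# are assumed, not taken from the original report.
result = empty_online_nodes("0,8,252-255\n", "0,8\n", "0,8\n")
```

On a live system the three strings would be read from the online, has_cpu and has_memory files under /sys/devices/system/node/.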