Re: [PATCH 3/6] x86: Minimize SRAT messages

From: Mike Travis
Date: Wed Feb 23 2011 - 16:24:24 EST


[I was finally able to get some time on our large UV test system.]

Here are the stats from testing on a system with 248 nodes, 606 EFI
mem ranges, and 1984 cores:

after get_log_buff_early: (17% overflow)

[ 0.000000] early log_buf free: -45723/262183(-17%)
[ 0.000000] first line: : mem339: type=7, attr=0xf, range=[0x00000e6000000000-0x00000e6fff000000) (6552

Here I enabled some cores that were disabled so now the system
has 248 nodes, 606 EFI mem ranges, 2368 cores.

after minimize-time-zero-msgs: (5% overflow)

[0] early log_buf free: -15184/262172(-5%)
[0] first line: [0x000000007226e000-0x0000000072271000) (0MB) <6>[0] EFI: mem67: type=3, attr=0

Condensing the SRAT: PXM APIC messages resulted in 26% space free
in the early log buffer...

Was 2368 lines (for 2368 cores):

779 [0] SRAT: PXM 0 -> APIC 0x0000 -> Node 0
780 [0] SRAT: PXM 0 -> APIC 0x0002 -> Node 0
781 [0] SRAT: PXM 0 -> APIC 0x0004 -> Node 0
...
3145 [0] SRAT: PXM 247 -> APIC 0x3df0 -> Node 247
3146 [0] SRAT: PXM 247 -> APIC 0x3df2 -> Node 247

Now it's 248 lines (for 248 nodes). (Nodes 0..191 have 10-core CPUs.)

777 [0] SRAT: Node 0: PXM:APIC 0:0 :2 :4 :16 :18 :32 :34 :36 :48 :50
778 [0] SRAT: Node 1: PXM:APIC 1:64 :66 :68 :80 :82 :96 :98 :100 :112 :114
779 [0] SRAT: Node 2: PXM:APIC 2:128 :130 :132 :144 :146 :160 :162 :164 :176 :178
...
968 [0] SRAT: Node 190: PXM:APIC 190:12160 :12162 :12164 :12176 :12178 :12192 :12194 :12196 :12208 :12210
969 [0] SRAT: Node 191: PXM:APIC 191:12224 :12226 :12228 :12240 :12242 :12256 :12258 :12260 :12272 :12274
...
1023 [0] SRAT: Node 246: PXM:APIC 246:15744 :15746 :15748 :15760 :15778 :15780 :15792 :15794
1024 [0] SRAT: Node 247: PXM:APIC 247:15808 :15810 :15812 :15826 :15840 :15844 :15856 :15858

[0] early log_buf free: 69649/192523(26%)
[0] first line: <6>[0] Initializing cgroup subsys cpuset <6>[0] Initializing cgroup subsys cpu

My question is: is the above decimal format satisfactory, or should it be
hex as shown below? (The "0x" prefix will add 8k bytes when there are 4096
cores, but the hex values themselves will be shorter.)

821 [0] SRAT: Node 0: PXM:APIC 0:0x0 :0x2 :0x4 :0x10 :0x12 :0x20 :0x22 :0x24 :0x30 :0x32
822 [0] SRAT: Node 1: PXM:APIC 1:0x40 :0x42 :0x44 :0x50 :0x52 :0x60 :0x62 :0x64 :0x70 :0x72
823 [0] SRAT: Node 2: PXM:APIC 2:0x80 :0x82 :0x84 :0x90 :0x92 :0xa0 :0xa2 :0xa4 :0xb0 :0xb2
...
1011 [0] SRAT: Node 190: PXM:APIC 190:0x2f80 :0x2f82 :0x2f84 :0x2f90 :0x2f92 :0x2fa0 :0x2fa2 :0x2fa4 :0x2fb0 :0x2fb2
1012 [0] SRAT: Node 191: PXM:APIC 191:0x2fc0 :0x2fc2 :0x2fc4 :0x2fd0 :0x2fd2 :0x2fe0 :0x2fe2 :0x2fe4 :0x2ff0 :0x2ff2
...
1067 [0] SRAT: Node 246: PXM:APIC 246:0x3d80 :0x3d82 :0x3d84 :0x3d90 :0x3da2 :0x3da4 :0x3db0 :0x3db2
1068 [0] SRAT: Node 247: PXM:APIC 247:0x3dc0 :0x3dc2 :0x3dc4 :0x3dd2 :0x3de0 :0x3de4 :0x3df0 :0x3df2
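
As a rough way to compare the two formats, the following user-space sketch
counts the bytes needed for the APIC ids of one node in decimal, in hex with
the "0x" prefix, and in hex without it. The names apic_id_bytes and id_fmt
are made up for this illustration and are not part of the patch; the " :"
separators are left out because they are the same in every format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical format selector, for illustration only. */
enum id_fmt { ID_DEC, ID_HEX_PREFIXED, ID_HEX_BARE };

/* Bytes needed to print all ids in the given style, separators excluded. */
static size_t apic_id_bytes(const unsigned int *ids, size_t n, enum id_fmt fmt)
{
	size_t total = 0;

	assert(ids != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		int len;

		/* snprintf(NULL, 0, ...) returns the length it would write. */
		switch (fmt) {
		case ID_HEX_PREFIXED:
			len = snprintf(NULL, 0, "0x%x", ids[i]);
			break;
		case ID_HEX_BARE:
			len = snprintf(NULL, 0, "%x", ids[i]);
			break;
		default:
			len = snprintf(NULL, 0, "%u", ids[i]);
			break;
		}
		if (len > 0)
			total += (size_t)len;
	}
	return total;
}
```

For the ten ids on Node 190 (12160 .. 12210) this gives 50 bytes in decimal,
60 with "0x", and 40 in bare hex. For the ten ids on Node 0 it gives 17, 37,
and 17.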

I will do some more study to see if making only these changes will
be enough to not overflow the early log buffer in a max config system.

(Btw, I have not figured out how to predict ahead of time that a given
APIC id is the last one on the node, so that the '\n' can be inserted
after it.)
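
One possible way around the '\n' problem, sketched here in user space and not
taken from the patch, is to avoid predicting the last id at all: start a new
line whenever the node differs from the previous entry's node, and terminate
the final line once after the loop. The struct srat_entry and format_srat
names are hypothetical, and the sketch assumes the entries for one node
arrive contiguously, as they do in the logs above. In the kernel this would
presumably map onto printk continuation lines (KERN_CONT / pr_cont), which
this sketch does not exercise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record: one SRAT CPU entry after the PXM -> node mapping. */
struct srat_entry {
	unsigned int pxm;
	unsigned int apic_id;
	int node;
};

/*
 * Format the entries into buf, one line per node. Returns the number of
 * bytes written, not counting the terminating NUL. Output stops early if
 * buf is too small.
 */
static size_t format_srat(const struct srat_entry *e, size_t n,
			  char *buf, size_t size)
{
	size_t off = 0;
	int last_node = -1;	/* assumes real node numbers are >= 0 */

	assert(buf != NULL && size > 0);
	buf[0] = '\0';

	for (size_t i = 0; i < n; i++) {
		int len;

		if (e[i].node != last_node) {
			/* Node changed: close the previous line, open a new one. */
			len = snprintf(buf + off, size - off,
				       "%sSRAT: Node %d: PXM:APIC %u:%u",
				       last_node < 0 ? "" : "\n",
				       e[i].node, e[i].pxm, e[i].apic_id);
			last_node = e[i].node;
		} else {
			/* Same node: append the APIC id to the current line. */
			len = snprintf(buf + off, size - off, " :%u",
				       e[i].apic_id);
		}
		if (len < 0 || (size_t)len >= size - off) {
			buf[off] = '\0';	/* drop the partial field */
			break;
		}
		off += (size_t)len;
	}

	/* Terminate the last line, if there is one and it fits. */
	if (off > 0 && off + 1 < size) {
		buf[off++] = '\n';
		buf[off] = '\0';
	}
	return off;
}
```

With the first two ids of Node 0 and Node 1 as input, this produces two lines
in the same layout as the condensed decimal output above.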

Thanks,
Mike
