Re: [RFC PATCH v2 0/4] Eliminate zone->lock contention for will-it-scale/page_fault1 and parallel free
From: Daniel Jordan
Date: Thu Mar 29 2018 - 15:21:33 EST
On 03/20/2018 04:54 AM, Aaron Lu wrote:
> This series is meant to improve zone->lock scalability for order 0 pages.
> With will-it-scale/page_fault1 workload, on a 2 sockets Intel Skylake
> server with 112 CPUs, CPU spend 80% of its time spinning on zone->lock.
> Perf profile shows the most time consuming part under zone->lock is the
> cache miss on "struct page", so here I'm trying to avoid those cache
> misses.

I ran page_fault1 comparing 4.16-rc5 to your recent work: these four
patches plus the three others from your GitHub branch zone_lock_rfc_v2.
Out of curiosity I also threw in another 4.16-rc5 with the pcp batch
size adjusted so high (10922 pages) that we always stay in the pcp lists
and out of buddy completely. I used your patch[*] in this last kernel.

This was on a 2-socket, 20-core Broadwell server.

There were some small regressions a bit outside the noise at low process
counts (2-5), but I'm not sure they're repeatable. Anyway, it does
improve the microbenchmark across the board.

[*] lkml.kernel.org/r/20170919072342.GB7263@intel.com
Attachment: gnuplot-pgfs-vs-ntask-iter1.png (PNG image)
# columns: proc speedup (%), thr speedup (%), proc pgf/s, proc stdev, thr pgf/s, thr stdev
# three rows per ntask (1-40), in the order 4.16-rc5 (1), lu-zone (2), 4.16-rc5-nz (3)
,,586305.0,747,587731.0,1766
4.0,3.4,609505.0,1563,608007.0,1170
8.0,5.9,633145.0,1752,622690.0,1287
,,1131428.0,7397,1022890.0,7334
-1.0,-0.3,1119974.0,2558,1020102.0,5707
3.0,3.3,1165004.0,6232,1056689.0,6411
,,1590413.0,6412,1346900.0,8925
-0.7,1.0,1579816.0,7217,1360376.0,4418
2.3,3.4,1626925.0,5321,1392515.0,8180
,,2064476.0,5035,1656475.0,14714
-1.0,0.9,2043240.0,9070,1672036.0,5342
1.2,3.9,2089959.0,8287,1721614.0,7797
,,2486090.0,11178,1878085.0,15286
-0.1,1.1,2483021.0,15100,1898459.0,9295
1.3,4.0,2517602.0,13717,1952995.0,7481
,,2869756.0,9194,2058398.0,20580
0.4,3.5,2882220.0,20584,2129444.0,9689
2.4,6.0,2937618.0,14126,2182859.0,9650
,,3242589.0,15354,2231977.0,20188
0.9,3.2,3270796.0,15125,2303780.0,6607
2.0,6.7,3306528.0,17683,2381279.0,16507
,,3598209.0,10765,2361819.0,13509
1.2,4.5,3642407.0,14894,2469191.0,16250
2.0,8.1,3671834.0,17501,2552786.0,12112
,,3974345.0,12605,2511565.0,31986
2.3,4.8,4067553.0,11070,2632608.0,8158
2.5,9.8,4073111.0,12464,2758075.0,31433
,,4333026.0,12187,2636914.0,15065
2.4,5.8,4435852.0,21692,2789949.0,16400
3.2,10.3,4470666.0,23663,2907263.0,15052
,,4932423.0,12184,2675769.0,23925
2.7,3.6,5064666.0,18600,2771476.0,22438
3.6,7.9,5110434.0,21460,2888419.0,18181
,,5461255.0,14704,2631232.0,24390
1.7,2.5,5554957.0,20979,2697554.0,19370
3.1,5.8,5629143.0,22781,2782902.0,20347
,,5924367.0,11835,2445607.0,25821
1.4,4.9,6004723.0,19071,2566547.0,26031
2.9,8.4,6094087.0,17676,2651793.0,20051
,,6381611.0,16792,2277611.0,39558
1.2,4.4,6459693.0,18869,2377795.0,25094
2.4,12.4,6534837.0,24991,2560085.0,12638
,,6804737.0,13654,2232121.0,19409
1.1,4.8,6881730.0,18995,2338868.0,18923
2.4,12.5,6970318.0,28677,2510594.0,25891
,,7197145.0,17500,2313168.0,16694
1.2,4.5,7287072.0,28727,2418232.0,23120
2.6,7.6,7383613.0,17992,2489382.0,28385
,,7550498.0,15101,2226427.0,24769
1.6,4.4,7667641.0,24306,2324675.0,16265
2.8,5.3,7761917.0,29855,2345195.0,20660
,,7902794.0,12579,2188399.0,37454
1.7,6.0,8033876.0,21158,2320641.0,13053
2.8,8.9,8126732.0,27620,2383083.0,25173
,,8277506.0,15448,2198021.0,33075
2.1,6.4,8453255.0,17411,2339395.0,20221
3.0,7.7,8529130.0,29853,2366800.0,20139
,,8651034.0,19706,2239626.0,25694
2.2,3.2,8840988.0,24387,2311966.0,25694
3.1,9.3,8918721.0,34762,2448589.0,26228
,,8777023.0,11833,2259622.0,32155
2.5,5.7,8993348.0,25481,2389000.0,29464
3.5,6.3,9085319.0,27791,2401713.0,28706
,,8855202.0,27455,2268030.0,35914
3.0,4.1,9123705.0,34289,2361876.0,31917
4.2,11.9,9228843.0,33484,2536867.0,26001
,,8952897.0,21601,2280539.0,30530
3.5,10.1,9268365.0,30804,2510883.0,26312
4.6,10.3,9367048.0,30681,2514740.0,32897
,,9036582.0,19483,2374728.0,42892
3.7,2.2,9369541.0,33253,2425993.0,45953
5.0,4.5,9489640.0,28066,2482179.0,33821
,,9136041.0,18233,2336090.0,30037
4.0,6.1,9497501.0,34409,2478602.0,31960
4.7,10.5,9563832.0,34507,2581056.0,38516
,,9226630.0,17998,2326070.0,33782
3.7,7.8,9570547.0,27052,2508396.0,41848
4.4,10.4,9634192.0,40218,2566842.0,31116
,,9305784.0,24574,2391261.0,29548
3.8,6.1,9656252.0,39164,2536624.0,45738
4.5,5.2,9720210.0,31172,2516447.0,32348
,,9381004.0,19378,2442774.0,35745
2.6,1.3,9626125.0,65187,2474560.0,29978
4.1,4.4,9766045.0,54298,2549227.0,31200
,,9401844.0,27746,2456372.0,40550
2.7,3.8,9652161.0,51629,2549004.0,39991
3.5,6.9,9731681.0,51852,2625589.0,27822
,,9428320.0,17562,2509119.0,39752
2.1,6.1,9630472.0,50106,2662447.0,44347
3.1,5.7,9722152.0,50349,2651891.0,28519
,,9561062.0,21910,2392883.0,24181
2.0,7.1,9755774.0,60382,2563573.0,32132
3.5,15.0,9894735.0,45967,2752506.0,28517
,,9624859.0,30462,2480667.0,27055
2.7,5.5,9883943.0,61851,2618326.0,36656
4.5,15.4,10057320.0,46352,2863788.0,34022
,,9739896.0,35436,2476666.0,30301
3.1,8.2,10043706.0,60944,2680570.0,42346
4.6,16.8,10191082.0,51348,2893385.0,32717
,,9833955.0,39366,2628480.0,36567
3.5,2.7,10180871.0,50050,2699941.0,42805
5.0,7.3,10323136.0,50768,2820628.0,30552
,,9908832.0,20826,2666415.0,51379
3.5,0.5,10251385.0,58551,2679925.0,49144
5.1,5.7,10418155.0,51726,2817192.0,34043
,,9969311.0,20378,2563399.0,36720
3.5,4.8,10314449.0,60867,2686176.0,42926
5.4,9.1,10504881.0,53101,2796816.0,37461
,,10077169.0,36182,2584728.0,32672
3.1,7.4,10393453.0,63048,2775523.0,39745
4.7,11.9,10549870.0,45281,2893000.0,39102
,,10115997.0,25835,2653036.0,33259
2.7,5.7,10388901.0,63402,2803290.0,36021
4.6,11.2,10580796.0,63517,2949834.0,31422
,,10162757.0,33119,2681195.0,30592
2.5,3.0,10413010.0,76720,2761752.0,32472
4.0,9.5,10568061.0,65614,2935127.0,38463
,,10223472.0,41882,2670421.0,26049
2.4,5.0,10470977.0,58009,2803478.0,37111
4.1,7.4,10646450.0,54810,2868986.0,52724
kernel       (#) ntask     proc      thr        proc   stdev        thr   stdev
                        speedup  speedup       pgf/s              pgf/s
4.16-rc5     (1)     1                       586,305     747    587,731   1,766
lu-zone      (2)     1     4.0%     3.4%     609,505   1,562    608,007   1,169
4.16-rc5-nz  (3)     1     8.0%     5.9%     633,145   1,752    622,690   1,286
4.16-rc5     (1)     2                     1,131,428   7,396  1,022,890   7,333
lu-zone      (2)     2    -1.0%    -0.3%   1,119,974   2,557  1,020,102   5,706
4.16-rc5-nz  (3)     2     3.0%     3.3%   1,165,004   6,232  1,056,689   6,411
4.16-rc5     (1)     3                     1,590,413   6,411  1,346,900   8,924
lu-zone      (2)     3    -0.7%     1.0%   1,579,816   7,216  1,360,376   4,418
4.16-rc5-nz  (3)     3     2.3%     3.4%   1,626,925   5,321  1,392,515   8,180
4.16-rc5     (1)     4                     2,064,476   5,034  1,656,475  14,713
lu-zone      (2)     4    -1.0%     0.9%   2,043,240   9,069  1,672,036   5,342
4.16-rc5-nz  (3)     4     1.2%     3.9%   2,089,959   8,287  1,721,614   7,796
4.16-rc5     (1)     5                     2,486,090  11,178  1,878,085  15,286
lu-zone      (2)     5    -0.1%     1.1%   2,483,021  15,100  1,898,459   9,295
4.16-rc5-nz  (3)     5     1.3%     4.0%   2,517,602  13,717  1,952,995   7,481
4.16-rc5     (1)     6                     2,869,756   9,194  2,058,398  20,580
lu-zone      (2)     6     0.4%     3.5%   2,882,220  20,583  2,129,444   9,689
4.16-rc5-nz  (3)     6     2.4%     6.0%   2,937,618  14,126  2,182,859   9,650
4.16-rc5     (1)     7                     3,242,589  15,354  2,231,977  20,188
lu-zone      (2)     7     0.9%     3.2%   3,270,796  15,124  2,303,780   6,607
4.16-rc5-nz  (3)     7     2.0%     6.7%   3,306,528  17,683  2,381,279  16,507
4.16-rc5     (1)     8                     3,598,209  10,764  2,361,819  13,508
lu-zone      (2)     8     1.2%     4.5%   3,642,407  14,893  2,469,191  16,250
4.16-rc5-nz  (3)     8     2.0%     8.1%   3,671,834  17,501  2,552,786  12,112
4.16-rc5     (1)     9                     3,974,345  12,605  2,511,565  31,986
lu-zone      (2)     9     2.3%     4.8%   4,067,553  11,069  2,632,608   8,158
4.16-rc5-nz  (3)     9     2.5%     9.8%   4,073,111  12,463  2,758,075  31,432
4.16-rc5     (1)    10                     4,333,026  12,187  2,636,914  15,064
lu-zone      (2)    10     2.4%     5.8%   4,435,852  21,691  2,789,949  16,399
4.16-rc5-nz  (3)    10     3.2%    10.3%   4,470,666  23,663  2,907,263  15,052
4.16-rc5     (1)    11                     4,932,423  12,183  2,675,769  23,924
lu-zone      (2)    11     2.7%     3.6%   5,064,666  18,600  2,771,476  22,438
4.16-rc5-nz  (3)    11     3.6%     7.9%   5,110,434  21,459  2,888,419  18,180
4.16-rc5     (1)    12                     5,461,255  14,704  2,631,232  24,390
lu-zone      (2)    12     1.7%     2.5%   5,554,957  20,978  2,697,554  19,369
4.16-rc5-nz  (3)    12     3.1%     5.8%   5,629,143  22,781  2,782,902  20,346
4.16-rc5     (1)    13                     5,924,367  11,835  2,445,607  25,821
lu-zone      (2)    13     1.4%     4.9%   6,004,723  19,070  2,566,547  26,031
4.16-rc5-nz  (3)    13     2.9%     8.4%   6,094,087  17,676  2,651,793  20,050
4.16-rc5     (1)    14                     6,381,611  16,791  2,277,611  39,557
lu-zone      (2)    14     1.2%     4.4%   6,459,693  18,869  2,377,795  25,093
4.16-rc5-nz  (3)    14     2.4%    12.4%   6,534,837  24,990  2,560,085  12,638
4.16-rc5     (1)    15                     6,804,737  13,653  2,232,121  19,408
lu-zone      (2)    15     1.1%     4.8%   6,881,730  18,995  2,338,868  18,922
4.16-rc5-nz  (3)    15     2.4%    12.5%   6,970,318  28,677  2,510,594  25,890
4.16-rc5     (1)    16                     7,197,145  17,499  2,313,168  16,694
lu-zone      (2)    16     1.2%     4.5%   7,287,072  28,727  2,418,232  23,120
4.16-rc5-nz  (3)    16     2.6%     7.6%   7,383,613  17,991  2,489,382  28,385
4.16-rc5     (1)    17                     7,550,498  15,101  2,226,427  24,768
lu-zone      (2)    17     1.6%     4.4%   7,667,641  24,305  2,324,675  16,265
4.16-rc5-nz  (3)    17     2.8%     5.3%   7,761,917  29,854  2,345,195  20,659
4.16-rc5     (1)    18                     7,902,794  12,578  2,188,399  37,453
lu-zone      (2)    18     1.7%     6.0%   8,033,876  21,158  2,320,641  13,053
4.16-rc5-nz  (3)    18     2.8%     8.9%   8,126,732  27,619  2,383,083  25,172
4.16-rc5     (1)    19                     8,277,506  15,448  2,198,021  33,074
lu-zone      (2)    19     2.1%     6.4%   8,453,255  17,411  2,339,395  20,220
4.16-rc5-nz  (3)    19     3.0%     7.7%   8,529,130  29,852  2,366,800  20,139
4.16-rc5     (1)    20                     8,651,034  19,705  2,239,626  25,694
lu-zone      (2)    20     2.2%     3.2%   8,840,988  24,387  2,311,966  25,693
4.16-rc5-nz  (3)    20     3.1%     9.3%   8,918,721  34,761  2,448,589  26,227
4.16-rc5     (1)    21                     8,777,023  11,833  2,259,622  32,154
lu-zone      (2)    21     2.5%     5.7%   8,993,348  25,480  2,389,000  29,464
4.16-rc5-nz  (3)    21     3.5%     6.3%   9,085,319  27,790  2,401,713  28,706
4.16-rc5     (1)    22                     8,855,202  27,455  2,268,030  35,914
lu-zone      (2)    22     3.0%     4.1%   9,123,705  34,288  2,361,876  31,917
4.16-rc5-nz  (3)    22     4.2%    11.9%   9,228,843  33,483  2,536,867  26,000
4.16-rc5     (1)    23                     8,952,897  21,601  2,280,539  30,530
lu-zone      (2)    23     3.5%    10.1%   9,268,365  30,803  2,510,883  26,312
4.16-rc5-nz  (3)    23     4.6%    10.3%   9,367,048  30,681  2,514,740  32,896
4.16-rc5     (1)    24                     9,036,582  19,482  2,374,728  42,891
lu-zone      (2)    24     3.7%     2.2%   9,369,541  33,253  2,425,993  45,952
4.16-rc5-nz  (3)    24     5.0%     4.5%   9,489,640  28,066  2,482,179  33,820
4.16-rc5     (1)    25                     9,136,041  18,232  2,336,090  30,036
lu-zone      (2)    25     4.0%     6.1%   9,497,501  34,408  2,478,602  31,959
4.16-rc5-nz  (3)    25     4.7%    10.5%   9,563,832  34,506  2,581,056  38,516
4.16-rc5     (1)    26                     9,226,630  17,998  2,326,070  33,782
lu-zone      (2)    26     3.7%     7.8%   9,570,547  27,052  2,508,396  41,848
4.16-rc5-nz  (3)    26     4.4%    10.4%   9,634,192  40,217  2,566,842  31,115
4.16-rc5     (1)    27                     9,305,784  24,573  2,391,261  29,547
lu-zone      (2)    27     3.8%     6.1%   9,656,252  39,164  2,536,624  45,738
4.16-rc5-nz  (3)    27     4.5%     5.2%   9,720,210  31,171  2,516,447  32,347
4.16-rc5     (1)    28                     9,381,004  19,377  2,442,774  35,745
lu-zone      (2)    28     2.6%     1.3%   9,626,125  65,187  2,474,560  29,977
4.16-rc5-nz  (3)    28     4.1%     4.4%   9,766,045  54,298  2,549,227  31,199
4.16-rc5     (1)    29                     9,401,844  27,746  2,456,372  40,549
lu-zone      (2)    29     2.7%     3.8%   9,652,161  51,629  2,549,004  39,990
4.16-rc5-nz  (3)    29     3.5%     6.9%   9,731,681  51,852  2,625,589  27,821
4.16-rc5     (1)    30                     9,428,320  17,561  2,509,119  39,752
lu-zone      (2)    30     2.1%     6.1%   9,630,472  50,106  2,662,447  44,347
4.16-rc5-nz  (3)    30     3.1%     5.7%   9,722,152  50,348  2,651,891  28,518
4.16-rc5     (1)    31                     9,561,062  21,909  2,392,883  24,180
lu-zone      (2)    31     2.0%     7.1%   9,755,774  60,381  2,563,573  32,132
4.16-rc5-nz  (3)    31     3.5%    15.0%   9,894,735  45,966  2,752,506  28,516
4.16-rc5     (1)    32                     9,624,859  30,462  2,480,667  27,055
lu-zone      (2)    32     2.7%     5.5%   9,883,943  61,850  2,618,326  36,655
4.16-rc5-nz  (3)    32     4.5%    15.4%  10,057,320  46,352  2,863,788  34,021
4.16-rc5     (1)    33                     9,739,896  35,435  2,476,666  30,301
lu-zone      (2)    33     3.1%     8.2%  10,043,706  60,943  2,680,570  42,346
4.16-rc5-nz  (3)    33     4.6%    16.8%  10,191,082  51,348  2,893,385  32,717
4.16-rc5     (1)    34                     9,833,955  39,366  2,628,480  36,567
lu-zone      (2)    34     3.5%     2.7%  10,180,871  50,050  2,699,941  42,804
4.16-rc5-nz  (3)    34     5.0%     7.3%  10,323,136  50,767  2,820,628  30,551
4.16-rc5     (1)    35                     9,908,832  20,826  2,666,415  51,379
lu-zone      (2)    35     3.5%     0.5%  10,251,385  58,551  2,679,925  49,143
4.16-rc5-nz  (3)    35     5.1%     5.7%  10,418,155  51,726  2,817,192  34,042
4.16-rc5     (1)    36                     9,969,311  20,377  2,563,399  36,720
lu-zone      (2)    36     3.5%     4.8%  10,314,449  60,867  2,686,176  42,925
4.16-rc5-nz  (3)    36     5.4%     9.1%  10,504,881  53,100  2,796,816  37,461
4.16-rc5     (1)    37                    10,077,169  36,182  2,584,728  32,672
lu-zone      (2)    37     3.1%     7.4%  10,393,453  63,048  2,775,523  39,745
4.16-rc5-nz  (3)    37     4.7%    11.9%  10,549,870  45,280  2,893,000  39,102
4.16-rc5     (1)    38                    10,115,997  25,835  2,653,036  33,259
lu-zone      (2)    38     2.7%     5.7%  10,388,901  63,402  2,803,290  36,020
4.16-rc5-nz  (3)    38     4.6%    11.2%  10,580,796  63,516  2,949,834  31,421
4.16-rc5     (1)    39                    10,162,757  33,118  2,681,195  30,591
lu-zone      (2)    39     2.5%     3.0%  10,413,010  76,719  2,761,752  32,471
4.16-rc5-nz  (3)    39     4.0%     9.5%  10,568,061  65,614  2,935,127  38,463
4.16-rc5     (1)    40                    10,223,472  41,882  2,670,421  26,049
lu-zone      (2)    40     2.4%     5.0%  10,470,977  58,008  2,803,478  37,111
4.16-rc5-nz  (3)    40     4.1%     7.4%  10,646,450  54,810  2,868,986  52,724
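
For anyone re-deriving the numbers: the speedup columns appear to be the
percent change in pgf/s relative to kernel (1) at the same ntask. Below is
a minimal sketch under that assumption. The helper name speedup_pct is
made up for illustration and is not part of will-it-scale; the inputs are
the process-mode pgf/s values from the ntask=1 and ntask=2 rows.

```python
def speedup_pct(baseline: float, candidate: float) -> float:
    """Percent change in page faults/sec relative to the baseline kernel.

    Assumed definition (not stated in the mail): (candidate - baseline) / baseline.
    """
    return (candidate - baseline) / baseline * 100.0


# Process-mode pgf/s copied from the table.
base_1 = 586_305      # 4.16-rc5 (1), ntask=1
lu_zone_1 = 609_505   # lu-zone (2), ntask=1
no_zone_1 = 633_145   # 4.16-rc5-nz (3), ntask=1

base_2 = 1_131_428    # 4.16-rc5 (1), ntask=2
lu_zone_2 = 1_119_974 # lu-zone (2), ntask=2

print(f"lu-zone,     ntask=1: {speedup_pct(base_1, lu_zone_1):.1f}%")
print(f"4.16-rc5-nz, ntask=1: {speedup_pct(base_1, no_zone_1):.1f}%")
print(f"lu-zone,     ntask=2: {speedup_pct(base_2, lu_zone_2):.1f}%")
```

Rounded to one decimal, these come out to 4.0%, 8.0% and -1.0%, matching
the proc speedup column for those rows, so the assumed definition is at
least consistent with the table.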