[PATCH 0/2] Revert NUMA aspect of fair allocation policy

From: Mel Gorman
Date: Fri Dec 20 2013 - 09:54:21 EST


Johannes and I have been investigating the NUMA behaviour of the fair
zone allocation policy over the last week. It's still a work in progress
but we had at least agreed that 3.12 and 3.13 are looking bad from a NUMA
perspective and settled on a patch. We did not reach this agreement fast
enough and the wrong patch got merged. This series reverts the problematic
patch and submits the one we agreed upon.

I ran a few quick tests on the series.

3.13.0-rc4 vanilla
latest-v5r3 Latest Linus tree: commit f7556698
nofairnuma-v5r3 These two patches

TL;DR: the key impact is on NUMA "miss", which counts the page allocations
that used remote memory and so hurt performance. The series eliminates them.

3.13.0-rc4 3.13.0-rc4 3.13.0-rc4
vanilla latest-v5r3 nofairnuma-v5r3
kernbench
NUMA alloc hit 73791065 74090594 93572436
NUMA alloc miss 20034206 19733525 0

vmr-stream
NUMA alloc hit 1354047 1536675 2105495
NUMA alloc miss 773033 581918 0

page fault microbenchmark
NUMA alloc hit 187897428 187004773 265000639
NUMA alloc miss 77105377 77995654 0

ebizzy
NUMA alloc hit 232222122 248319262 315782797
NUMA alloc miss 72813148 63336406 0
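
To put the miss counts above in proportion, here is a minimal sketch that
expresses misses as a share of all counted allocations. It assumes that
hit + miss is a fair denominator, which the report itself does not state.
The helper name miss_share is made up for illustration; the figures are the
kernbench rows copied from the table above.

```python
# Kernbench (hit, miss) counts copied from the summary table above.
KERNBENCH = {
    "vanilla": (73791065, 20034206),
    "latest-v5r3": (74090594, 19733525),
    "nofairnuma-v5r3": (93572436, 0),
}


def miss_share(hit, miss):
    """Return misses as a percentage of hit + miss (assumed denominator)."""
    total = hit + miss
    return 100.0 * miss / total if total else 0.0


for kernel, (hit, miss) in KERNBENCH.items():
    print(f"{kernel:16s} {miss_share(hit, miss):5.1f}% of allocations missed")
```

On these figures roughly a fifth of the kernbench allocations miss on the
first two kernels and none do with the series applied.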

More comprehensive results:

kernbench
3.13.0-rc4 3.13.0-rc4 3.13.0-rc4
vanilla latest-v5r3 nofairnuma-v5r3
User min 1414.50 ( 0.00%) 1414.50 ( 0.00%) 1405.07 ( 0.67%)
User mean 1417.28 ( 0.00%) 1418.02 ( -0.05%) 1409.90 ( 0.52%)
User stddev 2.36 ( 0.00%) 3.32 (-40.48%) 3.17 (-34.29%)
User max 1421.00 ( 0.00%) 1423.68 ( -0.19%) 1414.50 ( 0.46%)
User range 6.50 ( 0.00%) 9.18 (-41.23%) 9.43 (-45.08%)
System min 114.82 ( 0.00%) 115.24 ( -0.37%) 110.22 ( 4.01%)
System mean 116.03 ( 0.00%) 115.52 ( 0.44%) 110.52 ( 4.75%)
System stddev 0.77 ( 0.00%) 0.27 ( 64.47%) 0.25 ( 67.71%)
System max 117.15 ( 0.00%) 115.87 ( 1.09%) 110.82 ( 5.40%)
System range 2.33 ( 0.00%) 0.63 ( 72.96%) 0.60 ( 74.25%)
Elapsed min 42.92 ( 0.00%) 43.09 ( -0.40%) 43.08 ( -0.37%)
Elapsed mean 43.93 ( 0.00%) 43.88 ( 0.10%) 43.68 ( 0.56%)
Elapsed stddev 0.53 ( 0.00%) 0.51 ( 3.19%) 0.38 ( 29.13%)
Elapsed max 44.45 ( 0.00%) 44.60 ( -0.34%) 44.26 ( 0.43%)
Elapsed range 1.53 ( 0.00%) 1.51 ( 1.31%) 1.18 ( 22.88%)
CPU min 3443.00 ( 0.00%) 3430.00 ( 0.38%) 3438.00 ( 0.15%)
CPU mean 3490.40 ( 0.00%) 3494.20 ( -0.11%) 3480.60 ( 0.28%)
CPU stddev 47.79 ( 0.00%) 45.56 ( 4.66%) 29.45 ( 38.38%)
CPU max 3580.00 ( 0.00%) 3559.00 ( 0.59%) 3530.00 ( 1.40%)
CPU range 137.00 ( 0.00%) 129.00 ( 5.84%) 92.00 ( 32.85%)

3.13.0-rc4 3.13.0-rc4 3.13.0-rc4
vanilla latest-v5r3 nofairnuma-v5r3
User 8523.56 8522.59 8484.37
System 705.44 702.19 671.60
Elapsed 310.46 307.89 305.01

There is little obvious impact on the elapsed time for kernel builds, but
system CPU usage is reduced.
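
For anyone reading the percentage columns, they appear to be gains relative
to the vanilla kernel, signed so that a positive value is an improvement.
A minimal sketch under that assumption (gain_percent is a hypothetical
helper, not part of the reporting tool) reproduces the kernbench "System
mean" row. Other rows can differ in the last digit because the table prints
rounded values.

```python
def gain_percent(baseline, value, lower_is_better=True):
    """Percentage gain of value over baseline; positive means better."""
    if baseline == 0:
        return 0.0
    # For times, a smaller value is a gain; for throughput, a larger one is.
    delta = baseline - value if lower_is_better else value - baseline
    return delta / baseline * 100.0


# kernbench "System mean" in seconds (lower is better), copied from the table.
vanilla, latest, nofairnuma = 116.03, 115.52, 110.52
print(f"latest-v5r3      {gain_percent(vanilla, latest):6.2f}%")
print(f"nofairnuma-v5r3  {gain_percent(vanilla, nofairnuma):6.2f}%")
```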

3.13.0-rc4 3.13.0-rc4 3.13.0-rc4
vanilla latest-v5r3 nofairnuma-v5r3
NUMA alloc hit 73791065 74090594 93572436
NUMA alloc miss 20034206 19733525 0
NUMA interleave hit 0 0 0
NUMA alloc local 73791054 74090589 93572431
NUMA page range updates 5874649 5946004 5849646
NUMA huge PMD updates 132 158 125
NUMA PTE updates 5807197 5865266 5785771
NUMA hint faults 2385553 2403333 2378242
NUMA hint local faults 1782172 1810993 2157010
NUMA hint local percent 74 75 90
NUMA pages migrated 474970 482561 208555
AutoNUMA cost 11977 12067 11936

NUMA alloc misses are eliminated and NUMA hinting faults are more local.
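
The "NUMA hint local percent" row can be rederived from the two rows above
it. This is a minimal sketch that assumes the figure is local faults over
all hinting faults, truncated to an integer. Truncation is a guess that
happens to match all three columns; hint_local_percent is a hypothetical
name.

```python
def hint_local_percent(local_faults, hint_faults):
    """Integer percentage of NUMA hinting faults that were node-local."""
    if hint_faults == 0:
        return 0
    # Floor division truncates, which is assumed to match the reported row.
    return local_faults * 100 // hint_faults


# (hint local faults, hint faults) copied from the kernbench table above.
ROWS = {
    "vanilla": (1782172, 2385553),
    "latest-v5r3": (1810993, 2403333),
    "nofairnuma-v5r3": (2157010, 2378242),
}

for kernel, (local, total) in ROWS.items():
    print(f"{kernel:16s} {hint_local_percent(local, total)}% local")
```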

vmr-stream
3.13.0-rc4 3.13.0-rc4 3.13.0-rc4
vanilla latest-v5r3 nofairnuma-v5r3
Add 5M 3798.00 ( 0.00%) 3794.10 ( -0.10%) 3964.79 ( 4.39%)
Copy 5M 3366.28 ( 0.00%) 3355.99 ( -0.31%) 3462.03 ( 2.84%)
Scale 5M 3161.81 ( 0.00%) 3162.40 ( 0.02%) 3381.38 ( 6.94%)
Triad 5M 3523.40 ( 0.00%) 3519.55 ( -0.11%) 3838.11 ( 8.93%)
Add 7M 3730.57 ( 0.00%) 3770.11 ( 1.06%) 3974.30 ( 6.53%)
Copy 7M 3280.22 ( 0.00%) 3344.28 ( 1.95%) 3459.75 ( 5.47%)
Scale 7M 3101.82 ( 0.00%) 3145.88 ( 1.42%) 3379.28 ( 8.94%)
Triad 7M 3477.74 ( 0.00%) 3503.28 ( 0.73%) 3836.03 ( 10.30%)
Add 8M 3798.02 ( 0.00%) 3783.36 ( -0.39%) 3973.07 ( 4.61%)
Copy 8M 3358.04 ( 0.00%) 3347.92 ( -0.30%) 3461.89 ( 3.09%)
Scale 8M 3181.53 ( 0.00%) 3148.60 ( -1.03%) 3377.80 ( 6.17%)
Triad 8M 3533.03 ( 0.00%) 3515.36 ( -0.50%) 3835.95 ( 8.57%)
Add 10M 3800.29 ( 0.00%) 3769.95 ( -0.80%) 3973.26 ( 4.55%)
Copy 10M 3359.85 ( 0.00%) 3350.60 ( -0.28%) 3462.44 ( 3.05%)
Scale 10M 3173.04 ( 0.00%) 3139.96 ( -1.04%) 3378.16 ( 6.46%)
Triad 10M 3534.24 ( 0.00%) 3504.22 ( -0.85%) 3834.26 ( 8.49%)
Add 14M 3795.17 ( 0.00%) 3801.05 ( 0.15%) 3972.96 ( 4.68%)
Copy 14M 3355.50 ( 0.00%) 3367.94 ( 0.37%) 3461.37 ( 3.16%)
Scale 14M 3160.35 ( 0.00%) 3167.24 ( 0.22%) 3379.54 ( 6.94%)
Triad 14M 3526.50 ( 0.00%) 3526.30 ( -0.01%) 3835.83 ( 8.77%)
Add 17M 3787.71 ( 0.00%) 3778.67 ( -0.24%) 3974.02 ( 4.92%)
Copy 17M 3353.88 ( 0.00%) 3351.53 ( -0.07%) 3459.39 ( 3.15%)
Scale 17M 3157.43 ( 0.00%) 3153.50 ( -0.12%) 3378.06 ( 6.99%)
Triad 17M 3519.97 ( 0.00%) 3510.38 ( -0.27%) 3836.31 ( 8.99%)
Add 21M 3813.99 ( 0.00%) 3791.22 ( -0.60%) 4013.64 ( 5.23%)
Copy 21M 3367.09 ( 0.00%) 3343.67 ( -0.70%) 3478.93 ( 3.32%)
Scale 21M 3170.85 ( 0.00%) 3142.76 ( -0.89%) 3394.94 ( 7.07%)
Triad 21M 3540.99 ( 0.00%) 3524.25 ( -0.47%) 3853.95 ( 8.84%)
Add 28M 3791.41 ( 0.00%) 3766.75 ( -0.65%) 4012.68 ( 5.84%)
Copy 28M 3351.51 ( 0.00%) 3334.26 ( -0.51%) 3480.01 ( 3.83%)
Scale 28M 3166.36 ( 0.00%) 3145.50 ( -0.66%) 3393.61 ( 7.18%)
Triad 28M 3524.64 ( 0.00%) 3501.64 ( -0.65%) 3854.84 ( 9.37%)
Add 35M 3804.79 ( 0.00%) 3793.25 ( -0.30%) 3971.91 ( 4.39%)
Copy 35M 3358.44 ( 0.00%) 3358.86 ( 0.01%) 3461.08 ( 3.06%)
Scale 35M 3158.44 ( 0.00%) 3165.12 ( 0.21%) 3380.14 ( 7.02%)
Triad 35M 3535.12 ( 0.00%) 3520.64 ( -0.41%) 3836.04 ( 8.51%)
Add 42M 3809.87 ( 0.00%) 3767.46 ( -1.11%) 3973.00 ( 4.28%)
Copy 42M 3355.79 ( 0.00%) 3341.04 ( -0.44%) 3462.68 ( 3.19%)
Scale 42M 3172.80 ( 0.00%) 3146.79 ( -0.82%) 3378.09 ( 6.47%)
Triad 42M 3540.69 ( 0.00%) 3501.69 ( -1.10%) 3838.21 ( 8.40%)
Add 56M 3785.86 ( 0.00%) 3791.85 ( 0.16%) 3972.55 ( 4.93%)
Copy 56M 3353.31 ( 0.00%) 3360.84 ( 0.22%) 3460.49 ( 3.20%)
Scale 56M 3165.67 ( 0.00%) 3159.44 ( -0.20%) 3378.81 ( 6.73%)
Triad 56M 3521.56 ( 0.00%) 3523.75 ( 0.06%) 3836.52 ( 8.94%)
Add 71M 3810.08 ( 0.00%) 3792.50 ( -0.46%) 3973.04 ( 4.28%)
Copy 71M 3368.62 ( 0.00%) 3347.75 ( -0.62%) 3462.05 ( 2.77%)
Scale 71M 3167.19 ( 0.00%) 3160.05 ( -0.23%) 3379.95 ( 6.72%)
Triad 71M 3534.10 ( 0.00%) 3524.17 ( -0.28%) 3837.87 ( 8.60%)
Add 85M 3752.25 ( 0.00%) 3767.85 ( 0.42%) 3973.19 ( 5.89%)
Copy 85M 3285.89 ( 0.00%) 3333.99 ( 1.46%) 3459.58 ( 5.29%)
Scale 85M 3103.94 ( 0.00%) 3145.25 ( 1.33%) 3379.54 ( 8.88%)
Triad 85M 3500.92 ( 0.00%) 3506.22 ( 0.15%) 3834.88 ( 9.54%)
Add 113M 3798.99 ( 0.00%) 3800.44 ( 0.04%) 4012.76 ( 5.63%)
Copy 113M 3366.05 ( 0.00%) 3354.30 ( -0.35%) 3480.66 ( 3.40%)
Scale 113M 3158.15 ( 0.00%) 3151.65 ( -0.21%) 3395.47 ( 7.51%)
Triad 113M 3527.05 ( 0.00%) 3523.88 ( -0.09%) 3854.47 ( 9.28%)
Add 142M 3796.64 ( 0.00%) 3773.35 ( -0.61%) 4014.26 ( 5.73%)
Copy 142M 3361.19 ( 0.00%) 3345.00 ( -0.48%) 3481.51 ( 3.58%)
Scale 142M 3161.73 ( 0.00%) 3135.12 ( -0.84%) 3395.00 ( 7.38%)
Triad 142M 3529.36 ( 0.00%) 3503.87 ( -0.72%) 3854.34 ( 9.21%)
Add 170M 3806.53 ( 0.00%) 3792.00 ( -0.38%) 3974.21 ( 4.41%)
Copy 170M 3357.36 ( 0.00%) 3352.38 ( -0.15%) 3461.62 ( 3.11%)
Scale 170M 3160.90 ( 0.00%) 3164.89 ( 0.13%) 3379.16 ( 6.91%)
Triad 170M 3535.49 ( 0.00%) 3523.51 ( -0.34%) 3836.48 ( 8.51%)
Add 227M 3796.53 ( 0.00%) 3767.46 ( -0.77%) 4011.06 ( 5.65%)
Copy 227M 3364.99 ( 0.00%) 3332.86 ( -0.95%) 3479.83 ( 3.41%)
Scale 227M 3164.12 ( 0.00%) 3152.09 ( -0.38%) 3397.28 ( 7.37%)
Triad 227M 3529.30 ( 0.00%) 3505.34 ( -0.68%) 3855.76 ( 9.25%)
Add 284M 3799.86 ( 0.00%) 3796.23 ( -0.10%) 4012.36 ( 5.59%)
Copy 284M 3367.08 ( 0.00%) 3358.22 ( -0.26%) 3477.79 ( 3.29%)
Scale 284M 3160.58 ( 0.00%) 3149.70 ( -0.34%) 3396.63 ( 7.47%)
Triad 284M 3528.66 ( 0.00%) 3523.20 ( -0.15%) 3854.18 ( 9.22%)
Add 341M 3806.61 ( 0.00%) 3772.41 ( -0.90%) 4001.81 ( 5.13%)
Copy 341M 3356.31 ( 0.00%) 3334.36 ( -0.65%) 3474.46 ( 3.52%)
Scale 341M 3161.30 ( 0.00%) 3146.39 ( -0.47%) 3392.10 ( 7.30%)
Triad 341M 3541.64 ( 0.00%) 3508.46 ( -0.94%) 3849.70 ( 8.70%)
Add 455M 3791.88 ( 0.00%) 3786.70 ( -0.14%) 4013.14 ( 5.84%)
Copy 455M 3351.42 ( 0.00%) 3355.09 ( 0.11%) 3478.80 ( 3.80%)
Scale 455M 3153.66 ( 0.00%) 3160.05 ( 0.20%) 3395.35 ( 7.66%)
Triad 455M 3523.62 ( 0.00%) 3519.40 ( -0.12%) 3855.58 ( 9.42%)
Add 568M 3799.33 ( 0.00%) 3798.96 ( -0.01%) 3971.49 ( 4.53%)
Copy 568M 3362.64 ( 0.00%) 3350.96 ( -0.35%) 3459.20 ( 2.87%)
Scale 568M 3173.04 ( 0.00%) 3150.35 ( -0.72%) 3380.60 ( 6.54%)
Triad 568M 3529.29 ( 0.00%) 3525.15 ( -0.12%) 3840.76 ( 8.83%)
Add 682M 3797.03 ( 0.00%) 3768.91 ( -0.74%) 3973.01 ( 4.63%)
Copy 682M 3354.72 ( 0.00%) 3339.98 ( -0.44%) 3460.17 ( 3.14%)
Scale 682M 3153.94 ( 0.00%) 3148.32 ( -0.18%) 3377.85 ( 7.10%)
Triad 682M 3526.96 ( 0.00%) 3505.39 ( -0.61%) 3836.54 ( 8.78%)
Add 910M 3795.70 ( 0.00%) 3788.86 ( -0.18%) 3992.89 ( 5.20%)
Copy 910M 3357.38 ( 0.00%) 3354.01 ( -0.10%) 3471.01 ( 3.38%)
Scale 910M 3166.96 ( 0.00%) 3162.86 ( -0.13%) 3386.16 ( 6.92%)
Triad 910M 3529.87 ( 0.00%) 3519.05 ( -0.31%) 3843.79 ( 8.89%)
Add 1137M 3792.85 ( 0.00%) 3771.51 ( -0.56%) 3971.01 ( 4.70%)
Copy 1137M 3352.91 ( 0.00%) 3346.70 ( -0.19%) 3461.26 ( 3.23%)
Scale 1137M 3169.37 ( 0.00%) 3142.76 ( -0.84%) 3379.83 ( 6.64%)
Triad 1137M 3530.16 ( 0.00%) 3504.29 ( -0.73%) 3838.27 ( 8.73%)
Add 1365M 3801.74 ( 0.00%) 3793.69 ( -0.21%) 4013.10 ( 5.56%)
Copy 1365M 3363.16 ( 0.00%) 3352.88 ( -0.31%) 3481.69 ( 3.52%)
Scale 1365M 3172.05 ( 0.00%) 3164.03 ( -0.25%) 3394.99 ( 7.03%)
Triad 1365M 3535.04 ( 0.00%) 3526.16 ( -0.25%) 3853.50 ( 9.01%)
Add 1820M 3787.56 ( 0.00%) 3802.84 ( 0.40%) 4013.34 ( 5.96%)
Copy 1820M 3353.48 ( 0.00%) 3359.98 ( 0.19%) 3479.53 ( 3.76%)
Scale 1820M 3159.06 ( 0.00%) 3158.76 ( -0.01%) 3396.10 ( 7.50%)
Triad 1820M 3523.10 ( 0.00%) 3533.46 ( 0.29%) 3855.48 ( 9.43%)
Add 2275M 3792.81 ( 0.00%) 3767.08 ( -0.68%) 3972.93 ( 4.75%)
Copy 2275M 3363.74 ( 0.00%) 3338.62 ( -0.75%) 3459.66 ( 2.85%)
Scale 2275M 3162.05 ( 0.00%) 3147.96 ( -0.45%) 3379.24 ( 6.87%)
Triad 2275M 3526.64 ( 0.00%) 3502.22 ( -0.69%) 3836.95 ( 8.80%)
Add 2730M 3787.14 ( 0.00%) 3790.99 ( 0.10%) 4010.43 ( 5.90%)
Copy 2730M 3350.74 ( 0.00%) 3357.16 ( 0.19%) 3480.68 ( 3.88%)
Scale 2730M 3152.45 ( 0.00%) 3161.72 ( 0.29%) 3396.36 ( 7.74%)
Triad 2730M 3518.34 ( 0.00%) 3519.35 ( 0.03%) 3853.71 ( 9.53%)
Add 3640M 3806.59 ( 0.00%) 3767.80 ( -1.02%) 4015.21 ( 5.48%)
Copy 3640M 3357.56 ( 0.00%) 3328.65 ( -0.86%) 3481.54 ( 3.69%)
Scale 3640M 3165.75 ( 0.00%) 3145.30 ( -0.65%) 3394.62 ( 7.23%)
Triad 3640M 3536.01 ( 0.00%) 3501.84 ( -0.97%) 3853.20 ( 8.97%)
Add 4551M 3781.56 ( 0.00%) 3796.88 ( 0.40%) 4004.59 ( 5.90%)
Copy 4551M 3351.02 ( 0.00%) 3351.90 ( 0.03%) 3468.85 ( 3.52%)
Scale 4551M 3155.90 ( 0.00%) 3157.51 ( 0.05%) 3385.93 ( 7.29%)
Triad 4551M 3515.12 ( 0.00%) 3523.09 ( 0.23%) 3844.51 ( 9.37%)
Add 5461M 3799.03 ( 0.00%) 3770.88 ( -0.74%) 3991.51 ( 5.07%)
Copy 5461M 3352.10 ( 0.00%) 3335.93 ( -0.48%) 3470.78 ( 3.54%)
Scale 5461M 3162.40 ( 0.00%) 3156.15 ( -0.20%) 3386.75 ( 7.09%)
Triad 5461M 3530.26 ( 0.00%) 3513.79 ( -0.47%) 3845.24 ( 8.92%)
Add 7281M 3782.84 ( 0.00%) 3773.49 ( -0.25%) 3969.95 ( 4.95%)
Copy 7281M 3346.04 ( 0.00%) 3334.83 ( -0.34%) 3459.87 ( 3.40%)
Scale 7281M 3162.56 ( 0.00%) 3156.58 ( -0.19%) 3380.31 ( 6.89%)
Triad 7281M 3519.31 ( 0.00%) 3508.26 ( -0.31%) 3837.59 ( 9.04%)
Add 9102M 3809.64 ( 0.00%) 3777.05 ( -0.86%) 3972.09 ( 4.26%)
Copy 9102M 3367.56 ( 0.00%) 3343.22 ( -0.72%) 3460.40 ( 2.76%)
Scale 9102M 3167.18 ( 0.00%) 3148.99 ( -0.57%) 3379.55 ( 6.71%)
Triad 9102M 3538.70 ( 0.00%) 3511.49 ( -0.77%) 3836.54 ( 8.42%)
Add 10922M 3787.40 ( 0.00%) 3795.31 ( 0.21%) 3964.15 ( 4.67%)
Copy 10922M 3346.43 ( 0.00%) 3350.26 ( 0.11%) 3461.49 ( 3.44%)
Scale 10922M 3168.93 ( 0.00%) 3157.68 ( -0.36%) 3383.08 ( 6.76%)
Triad 10922M 3526.01 ( 0.00%) 3525.31 ( -0.02%) 3838.86 ( 8.87%)
Add 14563M 3794.28 ( 0.00%) 3803.31 ( 0.24%) 3994.41 ( 5.27%)
Copy 14563M 3356.78 ( 0.00%) 3354.56 ( -0.07%) 3464.31 ( 3.20%)
Scale 14563M 3157.36 ( 0.00%) 3154.81 ( -0.08%) 3380.29 ( 7.06%)
Triad 14563M 3528.26 ( 0.00%) 3532.54 ( 0.12%) 3838.84 ( 8.80%)
Add 18204M 3792.91 ( 0.00%) 3765.64 ( -0.72%) 3970.11 ( 4.67%)
Copy 18204M 3359.59 ( 0.00%) 3334.04 ( -0.76%) 3460.66 ( 3.01%)
Scale 18204M 3153.11 ( 0.00%) 3149.93 ( -0.10%) 3380.03 ( 7.20%)
Triad 18204M 3521.49 ( 0.00%) 3504.64 ( -0.48%) 3836.75 ( 8.95%)
Add 21845M 3793.56 ( 0.00%) 3791.04 ( -0.07%) 3983.20 ( 5.00%)
Copy 21845M 3348.75 ( 0.00%) 3362.11 ( 0.40%) 3472.91 ( 3.71%)
Scale 21845M 3163.58 ( 0.00%) 3158.30 ( -0.17%) 3388.11 ( 7.10%)
Triad 21845M 3526.31 ( 0.00%) 3520.69 ( -0.16%) 3848.07 ( 9.12%)
Add 29127M 3796.59 ( 0.00%) 3792.54 ( -0.11%) 3972.66 ( 4.64%)
Copy 29127M 3358.66 ( 0.00%) 3346.16 ( -0.37%) 3461.94 ( 3.07%)
Scale 29127M 3162.06 ( 0.00%) 3144.85 ( -0.54%) 3380.66 ( 6.91%)
Triad 29127M 3530.26 ( 0.00%) 3520.10 ( -0.29%) 3835.62 ( 8.65%)
Add 36408M 3813.66 ( 0.00%) 3799.59 ( -0.37%) 3971.34 ( 4.13%)
Copy 36408M 3367.69 ( 0.00%) 3350.04 ( -0.52%) 3461.83 ( 2.80%)
Scale 36408M 3170.05 ( 0.00%) 3172.31 ( 0.07%) 3379.70 ( 6.61%)
Triad 36408M 3540.65 ( 0.00%) 3535.71 ( -0.14%) 3837.16 ( 8.37%)
Add 43690M 3798.39 ( 0.00%) 3758.19 ( -1.06%) 4014.44 ( 5.69%)
Copy 43690M 3353.06 ( 0.00%) 3337.86 ( -0.45%) 3478.91 ( 3.75%)
Scale 43690M 3161.25 ( 0.00%) 3148.81 ( -0.39%) 3394.53 ( 7.38%)
Triad 43690M 3529.36 ( 0.00%) 3496.36 ( -0.94%) 3852.56 ( 9.16%)
Add 58254M 3802.26 ( 0.00%) 3787.54 ( -0.39%) 4013.98 ( 5.57%)
Copy 58254M 3362.43 ( 0.00%) 3351.81 ( -0.32%) 3479.95 ( 3.50%)
Scale 58254M 3170.20 ( 0.00%) 3161.28 ( -0.28%) 3395.40 ( 7.10%)
Triad 58254M 3536.94 ( 0.00%) 3521.71 ( -0.43%) 3853.26 ( 8.94%)
Add 72817M 3791.47 ( 0.00%) 3792.54 ( 0.03%) 3969.94 ( 4.71%)
Copy 72817M 3356.57 ( 0.00%) 3352.09 ( -0.13%) 3460.17 ( 3.09%)
Scale 72817M 3157.90 ( 0.00%) 3156.49 ( -0.04%) 3378.86 ( 7.00%)
Triad 72817M 3524.35 ( 0.00%) 3517.28 ( -0.20%) 3838.25 ( 8.91%)
Add 87381M 3523.39 ( 0.00%) 3506.88 ( -0.47%) 3813.72 ( 8.24%)
Copy 87381M 3190.45 ( 0.00%) 3169.49 ( -0.66%) 3253.15 ( 1.97%)
Scale 87381M 2842.54 ( 0.00%) 2834.86 ( -0.27%) 3162.54 ( 11.26%)
Triad 87381M 3464.16 ( 0.00%) 3450.29 ( -0.40%) 3763.46 ( 8.64%)

This is a memory streaming benchmark that benefits heavily from using local
memory. System CPU usage is also reduced, and the same observations apply to
NUMA misses in the vmstats and to the locality of NUMA hinting faults.

pft
3.13.0-rc4 3.13.0-rc4 3.13.0-rc4
vanilla latest-v5r3 nofairnuma-v5r3
User 1 0.6710 ( 0.00%) 0.6780 ( -1.04%) 0.6540 ( 2.53%)
User 2 0.7070 ( 0.00%) 0.6530 ( 7.64%) 0.6980 ( 1.27%)
User 3 0.7330 ( 0.00%) 0.7090 ( 3.27%) 0.7160 ( 2.32%)
User 4 0.7150 ( 0.00%) 0.7360 ( -2.94%) 0.7690 ( -7.55%)
User 5 0.7630 ( 0.00%) 0.7700 ( -0.92%) 0.7960 ( -4.33%)
User 6 0.8070 ( 0.00%) 0.7910 ( 1.98%) 0.8040 ( 0.37%)
User 7 0.8110 ( 0.00%) 0.8570 ( -5.67%) 0.7880 ( 2.84%)
User 8 0.7930 ( 0.00%) 0.8050 ( -1.51%) 0.8080 ( -1.89%)
System 1 9.1070 ( 0.00%) 9.0330 ( 0.81%) 8.2500 ( 9.41%)
System 2 9.3600 ( 0.00%) 9.2440 ( 1.24%) 8.2860 ( 11.47%)
System 3 9.0810 ( 0.00%) 8.9210 ( 1.76%) 8.5590 ( 5.75%)
System 4 8.8590 ( 0.00%) 8.7850 ( 0.84%) 8.6400 ( 2.47%)
System 5 9.5320 ( 0.00%) 9.3480 ( 1.93%) 8.9360 ( 6.25%)
System 6 9.8130 ( 0.00%) 9.7910 ( 0.22%) 9.1830 ( 6.42%)
System 7 9.9470 ( 0.00%) 9.9100 ( 0.37%) 9.1550 ( 7.96%)
System 8 9.8450 ( 0.00%) 9.8180 ( 0.27%) 9.3150 ( 5.38%)
Elapsed 1 9.7810 ( 0.00%) 9.7170 ( 0.65%) 8.9070 ( 8.94%)
Elapsed 2 5.0490 ( 0.00%) 5.0120 ( 0.73%) 4.5640 ( 9.61%)
Elapsed 3 3.2930 ( 0.00%) 3.2570 ( 1.09%) 3.1060 ( 5.68%)
Elapsed 4 2.4250 ( 0.00%) 2.4120 ( 0.54%) 2.3770 ( 1.98%)
Elapsed 5 2.1400 ( 0.00%) 2.1020 ( 1.78%) 2.0130 ( 5.93%)
Elapsed 6 1.8310 ( 0.00%) 1.8210 ( 0.55%) 1.6990 ( 7.21%)
Elapsed 7 1.5780 ( 0.00%) 1.6000 ( -1.39%) 1.4610 ( 7.41%)
Elapsed 8 1.3570 ( 0.00%) 1.3440 ( 0.96%) 1.2980 ( 4.35%)
Faults/cpu 1 337939.0846 ( 0.00%) 340235.9696 ( 0.68%) 371187.5111 ( 9.84%)
Faults/cpu 2 328295.1589 ( 0.00%) 333950.4292 ( 1.72%) 367770.1269 ( 12.02%)
Faults/cpu 3 336740.2077 ( 0.00%) 343289.7536 ( 1.94%) 356429.4274 ( 5.85%)
Faults/cpu 4 345242.9687 ( 0.00%) 347153.8004 ( 0.55%) 351299.3686 ( 1.75%)
Faults/cpu 5 321235.7607 ( 0.00%) 326668.3097 ( 1.69%) 339716.8029 ( 5.75%)
Faults/cpu 6 311274.7429 ( 0.00%) 312409.6305 ( 0.36%) 331098.6934 ( 6.37%)
Faults/cpu 7 307505.0201 ( 0.00%) 306974.8841 ( -0.17%) 332412.6864 ( 8.10%)
Faults/cpu 8 310762.2478 ( 0.00%) 311220.4981 ( 0.15%) 326840.6273 ( 5.17%)
Faults/sec 1 337818.9673 ( 0.00%) 340054.4570 ( 0.66%) 371024.2076 ( 9.83%)
Faults/sec 2 654376.0180 ( 0.00%) 659535.4438 ( 0.79%) 724303.5620 ( 10.69%)
Faults/sec 3 1002998.9609 ( 0.00%) 1015206.6321 ( 1.22%) 1064424.6960 ( 6.12%)
Faults/sec 4 1362925.0535 ( 0.00%) 1370053.7380 ( 0.52%) 1390373.3128 ( 2.01%)
Faults/sec 5 1545330.0919 ( 0.00%) 1573054.8359 ( 1.79%) 1643071.0340 ( 6.32%)
Faults/sec 6 1806556.5143 ( 0.00%) 1816484.7109 ( 0.55%) 1946598.9551 ( 7.75%)
Faults/sec 7 2098145.9361 ( 0.00%) 2069188.7963 ( -1.38%) 2260923.0951 ( 7.76%)
Faults/sec 8 2436469.5049 ( 0.00%) 2460168.4672 ( 0.97%) 2551945.7388 ( 4.74%)

Big benefits again from using local memory.

ebizzy
3.13.0-rc4 3.13.0-rc4 3.13.0-rc4
vanilla latest-v5r3 nofairnuma-v5r3
Mean 1 3099.33 ( 0.00%) 3148.33 ( 1.58%) 3200.67 ( 3.27%)
Mean 2 2265.67 ( 0.00%) 2317.00 ( 2.27%) 2318.33 ( 2.32%)
Mean 3 2212.33 ( 0.00%) 2266.00 ( 2.43%) 2275.33 ( 2.85%)
Mean 4 2190.67 ( 0.00%) 2277.67 ( 3.97%) 2267.67 ( 3.51%)
Mean 5 2191.33 ( 0.00%) 2261.67 ( 3.21%) 2263.00 ( 3.27%)
Mean 6 2181.00 ( 0.00%) 2233.67 ( 2.41%) 2251.33 ( 3.22%)
Mean 7 2188.67 ( 0.00%) 2244.67 ( 2.56%) 2298.33 ( 5.01%)
Mean 8 2180.67 ( 0.00%) 2244.00 ( 2.90%) 2253.33 ( 3.33%)
Mean 12 2200.67 ( 0.00%) 2205.33 ( 0.21%) 2252.67 ( 2.36%)
Mean 16 2197.00 ( 0.00%) 2231.00 ( 1.55%) 2236.33 ( 1.79%)
Mean 20 2204.67 ( 0.00%) 2187.00 ( -0.80%) 2220.33 ( 0.71%)
Mean 24 2124.00 ( 0.00%) 2178.00 ( 2.54%) 2206.00 ( 3.86%)
Mean 28 2071.67 ( 0.00%) 2112.00 ( 1.95%) 2141.67 ( 3.38%)
Mean 32 2028.67 ( 0.00%) 2074.00 ( 2.23%) 2101.00 ( 3.57%)
Mean 36 1969.33 ( 0.00%) 2023.67 ( 2.76%) 2055.33 ( 4.37%)
Mean 40 1935.67 ( 0.00%) 1994.67 ( 3.05%) 2040.33 ( 5.41%)
Mean 44 1909.33 ( 0.00%) 1960.33 ( 2.67%) 2006.33 ( 5.08%)
Mean 48 1885.33 ( 0.00%) 1924.00 ( 2.05%) 2009.33 ( 6.58%)
Range 1 106.00 ( 0.00%) 10.00 ( 90.57%) 27.00 ( 74.53%)
Range 2 54.00 ( 0.00%) 69.00 (-27.78%) 9.00 ( 83.33%)
Range 3 35.00 ( 0.00%) 30.00 ( 14.29%) 37.00 ( -5.71%)
Range 4 16.00 ( 0.00%) 48.00 (-200.00%) 41.00 (-156.25%)
Range 5 3.00 ( 0.00%) 54.00 (-1700.00%) 11.00 (-266.67%)
Range 6 69.00 ( 0.00%) 18.00 ( 73.91%) 31.00 ( 55.07%)
Range 7 28.00 ( 0.00%) 137.00 (-389.29%) 66.00 (-135.71%)
Range 8 71.00 ( 0.00%) 22.00 ( 69.01%) 9.00 ( 87.32%)
Range 12 50.00 ( 0.00%) 39.00 ( 22.00%) 82.00 (-64.00%)
Range 16 37.00 ( 0.00%) 38.00 ( -2.70%) 42.00 (-13.51%)
Range 20 66.00 ( 0.00%) 28.00 ( 57.58%) 34.00 ( 48.48%)
Range 24 43.00 ( 0.00%) 114.00 (-165.12%) 22.00 ( 48.84%)
Range 28 20.00 ( 0.00%) 54.00 (-170.00%) 68.00 (-240.00%)
Range 32 17.00 ( 0.00%) 54.00 (-217.65%) 37.00 (-117.65%)
Range 36 14.00 ( 0.00%) 34.00 (-142.86%) 21.00 (-50.00%)
Range 40 10.00 ( 0.00%) 35.00 (-250.00%) 25.00 (-150.00%)
Range 44 19.00 ( 0.00%) 14.00 ( 26.32%) 17.00 ( 10.53%)
Range 48 8.00 ( 0.00%) 22.00 (-175.00%) 6.00 ( 25.00%)
Stddev 1 46.15 ( 0.00%) 4.50 ( 90.26%) 11.15 ( 75.85%)
Stddev 2 22.54 ( 0.00%) 29.63 (-31.44%) 3.68 ( 83.67%)
Stddev 3 14.61 ( 0.00%) 12.57 ( 13.99%) 15.20 ( -3.98%)
Stddev 4 7.54 ( 0.00%) 19.74 (-161.68%) 19.33 (-156.25%)
Stddev 5 1.25 ( 0.00%) 25.22 (-1922.37%) 4.97 (-298.21%)
Stddev 6 28.18 ( 0.00%) 7.41 ( 73.71%) 12.76 ( 54.71%)
Stddev 7 11.47 ( 0.00%) 56.07 (-388.88%) 30.00 (-161.59%)
Stddev 8 29.78 ( 0.00%) 9.42 ( 68.38%) 3.68 ( 87.64%)
Stddev 12 20.95 ( 0.00%) 16.21 ( 22.61%) 34.03 (-62.45%)
Stddev 16 15.25 ( 0.00%) 15.58 ( -2.13%) 18.73 (-22.81%)
Stddev 20 27.21 ( 0.00%) 12.19 ( 55.18%) 15.37 ( 43.51%)
Stddev 24 19.82 ( 0.00%) 51.15 (-158.11%) 8.98 ( 54.68%)
Stddev 28 8.81 ( 0.00%) 25.46 (-189.06%) 28.43 (-222.82%)
Stddev 32 6.94 ( 0.00%) 22.45 (-223.29%) 16.39 (-136.04%)
Stddev 36 5.73 ( 0.00%) 14.27 (-148.78%) 8.96 (-56.18%)
Stddev 40 4.50 ( 0.00%) 14.84 (-230.00%) 10.27 (-128.47%)
Stddev 44 7.85 ( 0.00%) 5.79 ( 26.17%) 6.94 ( 11.49%)
Stddev 48 3.77 ( 0.00%) 9.42 (-149.69%) 2.49 ( 33.86%)

Smallish gain but enough to be happy about.

mm/page_alloc.c | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)

--
1.8.4
