Re: zram: per-cpu compression streams

From: Sergey Senozhatsky
Date: Tue Apr 26 2016 - 07:21:40 EST


Hello Minchan,

On (04/19/16 17:00), Minchan Kim wrote:
[..]
> I'm convinced now with your data. Super thanks!
> However, as you know, we need data how bad it is in heavy memory pressure.
> Maybe, you can test it with fio and background memory hogger,

it's really hard to produce stable test results when the system
is under memory pressure.

first, I modified zram to export the number of re-compressions
(i.e. how often we put the cpu stream and re-try the handle allocation)

mm_stat for numjobs={1..10}. the number of re-compressions is the last
field on each line, in "< NUM>" format

3221225472 3221225472 3221225472 0 3221229568 0 0 < 6421>
3221225472 3221225472 3221225472 0 3221233664 0 0 < 6998>
3221225472 2912157607 2952802304 0 2952814592 0 84 < 7271>
3221225472 2893479936 2899120128 0 2899136512 0 156 < 8260>
3221217280 2886040814 2899099648 0 2899128320 0 78 < 8297>
3221225472 2880045056 2885693440 0 2885718016 0 54 < 7794>
3221213184 2877431364 2883756032 0 2883801088 0 144 < 7336>
3221225472 2873229312 2876096512 0 2876133376 0 28 < 8699>
3221213184 2870728008 2871693312 0 2871730176 0 30 < 8189>
2899095552 2899095552 2899095552 0 2899136512 78643 0 < 7485>

as we can see, the number of re-compressions can vary from 6421 to 8699.
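
The range quoted above can be re-derived from the mm_stat lines. This is a minimal sketch, assuming the trailing "< NUM>" field is the re-compression counter exported by the modified zram described above (it is not part of the stock mm_stat output); `recompressions` is a hypothetical helper name.

```python
import re

# mm_stat lines for numjobs 1..10, copied from the table above.
MM_STAT = """\
3221225472 3221225472 3221225472 0 3221229568 0 0 < 6421>
3221225472 3221225472 3221225472 0 3221233664 0 0 < 6998>
3221225472 2912157607 2952802304 0 2952814592 0 84 < 7271>
3221225472 2893479936 2899120128 0 2899136512 0 156 < 8260>
3221217280 2886040814 2899099648 0 2899128320 0 78 < 8297>
3221225472 2880045056 2885693440 0 2885718016 0 54 < 7794>
3221213184 2877431364 2883756032 0 2883801088 0 144 < 7336>
3221225472 2873229312 2876096512 0 2876133376 0 28 < 8699>
3221213184 2870728008 2871693312 0 2871730176 0 30 < 8189>
2899095552 2899095552 2899095552 0 2899136512 78643 0 < 7485>
"""


def recompressions(text):
    """Return the "< NUM>" counter of every mm_stat line as an int."""
    return [int(m.group(1)) for m in re.finditer(r"<\s*(\d+)>", text)]


counts = recompressions(MM_STAT)
print("min:", min(counts), "max:", max(counts))
```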


the test:

-- 4 GB x86_64 box
-- zram 3GB, lzo
-- mem-hogger pre-faults 3GB of pages before the fio test
-- fio test has been modified to have 11% compression ratio (to increase the
   chances of re-compressions)
   -- buffer_compress_percentage=11
   -- scramble_buffers=0


considering buffer_compress_percentage=11, the box was under somewhat
heavy pressure.

now, the results


fio stats

                4 streams        8 streams        per cpu
===========================================================
#jobs1
READ: 2411.4MB/s 2430.4MB/s 2440.4MB/s
READ: 2094.8MB/s 2002.7MB/s 2034.5MB/s
WRITE: 141571KB/s 140334KB/s 143542KB/s
WRITE: 712025KB/s 706111KB/s 745256KB/s
READ: 531014KB/s 525250KB/s 537547KB/s
WRITE: 530960KB/s 525197KB/s 537492KB/s
READ: 473577KB/s 470320KB/s 476880KB/s
WRITE: 473645KB/s 470387KB/s 476948KB/s
#jobs2
READ: 7897.2MB/s 8031.4MB/s 7968.9MB/s
READ: 6864.9MB/s 6803.2MB/s 6903.4MB/s
WRITE: 321386KB/s 314227KB/s 313101KB/s
WRITE: 1275.3MB/s 1245.6MB/s 1383.5MB/s
READ: 1035.5MB/s 1021.9MB/s 1098.4MB/s
WRITE: 1035.6MB/s 1021.1MB/s 1098.6MB/s
READ: 972014KB/s 952321KB/s 987.66MB/s
WRITE: 969792KB/s 950144KB/s 985.40MB/s
#jobs3
READ: 13260MB/s 13260MB/s 13222MB/s
READ: 11636MB/s 11636MB/s 11755MB/s
WRITE: 511500KB/s 507730KB/s 504959KB/s
WRITE: 1646.1MB/s 1673.9MB/s 1755.5MB/s
READ: 1389.5MB/s 1387.2MB/s 1479.6MB/s
WRITE: 1387.6MB/s 1385.3MB/s 1477.4MB/s
READ: 1286.8MB/s 1289.1MB/s 1377.3MB/s
WRITE: 1284.8MB/s 1287.1MB/s 1374.9MB/s
#jobs4
READ: 19851MB/s 20244MB/s 20344MB/s
READ: 17732MB/s 17835MB/s 18097MB/s
WRITE: 667776KB/s 655599KB/s 693464KB/s
WRITE: 2041.2MB/s 2072.6MB/s 2474.1MB/s
READ: 1770.1MB/s 1781.7MB/s 2035.5MB/s
WRITE: 1765.8MB/s 1777.3MB/s 2030.5MB/s
READ: 1641.6MB/s 1672.4MB/s 1892.5MB/s
WRITE: 1643.2MB/s 1674.2MB/s 1894.4MB/s
#jobs5
READ: 19468MB/s 18484MB/s 18439MB/s
READ: 17594MB/s 17757MB/s 17716MB/s
WRITE: 843266KB/s 859627KB/s 867928KB/s
WRITE: 1927.1MB/s 2041.8MB/s 2168.9MB/s
READ: 1718.6MB/s 1771.7MB/s 1963.5MB/s
WRITE: 1712.7MB/s 1765.6MB/s 1956.8MB/s
READ: 1705.3MB/s 1663.6MB/s 1767.3MB/s
WRITE: 1704.3MB/s 1662.6MB/s 1766.2MB/s
#jobs6
READ: 21583MB/s 21685MB/s 21483MB/s
READ: 19160MB/s 18432MB/s 18618MB/s
WRITE: 986276KB/s 1004.2MB/s 981.11MB/s
WRITE: 2013.6MB/s 1922.5MB/s 2429.1MB/s
READ: 1797.1MB/s 1678.9MB/s 2038.8MB/s
WRITE: 1794.8MB/s 1675.9MB/s 2035.2MB/s
READ: 1678.2MB/s 1632.5MB/s 1917.4MB/s
WRITE: 1673.9MB/s 1627.6MB/s 1911.6MB/s
#jobs7
READ: 20697MB/s 21677MB/s 21062MB/s
READ: 18781MB/s 18667MB/s 19338MB/s
WRITE: 1074.6MB/s 1099.8MB/s 1105.3MB/s
WRITE: 2100.7MB/s 2010.3MB/s 2598.7MB/s
READ: 1783.2MB/s 1710.2MB/s 2027.8MB/s
WRITE: 1784.3MB/s 1712.1MB/s 2029.6MB/s
READ: 1690.8MB/s 1620.6MB/s 1893.6MB/s
WRITE: 1681.4MB/s 1611.7MB/s 1883.7MB/s
#jobs8
READ: 19883MB/s 20827MB/s 20395MB/s
READ: 18562MB/s 18178MB/s 17822MB/s
WRITE: 1240.5MB/s 1307.3MB/s 1331.7MB/s
WRITE: 2132.1MB/s 2143.6MB/s 2564.9MB/s
READ: 1841.1MB/s 1831.1MB/s 2111.4MB/s
WRITE: 1843.1MB/s 1833.1MB/s 2113.4MB/s
READ: 1795.4MB/s 1778.6MB/s 2029.3MB/s
WRITE: 1791.4MB/s 1774.5MB/s 2024.5MB/s
#jobs9
READ: 18834MB/s 19470MB/s 19402MB/s
READ: 17988MB/s 18118MB/s 18531MB/s
WRITE: 1339.4MB/s 1441.2MB/s 1512.6MB/s
WRITE: 2102.4MB/s 2111.9MB/s 2478.8MB/s
READ: 1754.5MB/s 1777.3MB/s 2050.2MB/s
WRITE: 1753.9MB/s 1776.7MB/s 2049.5MB/s
READ: 1686.4MB/s 1698.2MB/s 1931.6MB/s
WRITE: 1684.1MB/s 1696.8MB/s 1929.1MB/s
#jobs10
READ: 19128MB/s 19517MB/s 19592MB/s
READ: 18177MB/s 17544MB/s 18221MB/s
WRITE: 1397.1MB/s 1567.4MB/s 1683.2MB/s
WRITE: 2151.9MB/s 2205.1MB/s 2642.6MB/s
READ: 1879.2MB/s 1907.3MB/s 2223.2MB/s
WRITE: 1878.5MB/s 1906.2MB/s 2222.8MB/s
READ: 1835.7MB/s 1837.9MB/s 2131.4MB/s
WRITE: 1838.6MB/s 1840.8MB/s 2134.8MB/s
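
Some rows above mix KB/s and MB/s, so the columns are not directly comparable by eye. A small sketch for normalizing them follows; it assumes fio's units here are binary (1 MB = 1024 KB), and `to_mb_per_sec` is a hypothetical helper, not part of fio.

```python
def to_mb_per_sec(value):
    """Convert a fio bandwidth string such as '972014KB/s' to MB/s."""
    if value.endswith("MB/s"):
        return float(value[:-4])
    if value.endswith("KB/s"):
        # assumption: 1 MB = 1024 KB in this fio output
        return float(value[:-4]) / 1024
    raise ValueError(f"unexpected unit in {value!r}")


# #jobs2, last READ row: 4 streams, 8 streams, per cpu
row = ["972014KB/s", "952321KB/s", "987.66MB/s"]
base, eight, percpu = (to_mb_per_sec(v) for v in row)

# relative change of per cpu against the 4 streams baseline, in percent
gain = (percpu / base - 1) * 100
print(f"4 streams {base:.2f}MB/s, per cpu {percpu:.2f}MB/s, {gain:+.1f}%")
```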


perf stats

                                4 streams                8 streams                per cpu
====================================================================================================================
jobs1
stalled-cycles-frontend 52,219,601,943 ( 55.87%) 53,406,899,652 ( 56.33%) 49,944,625,376 ( 56.27%)
stalled-cycles-backend 23,194,739,214 ( 24.82%) 24,397,423,796 ( 25.73%) 22,782,579,660 ( 25.67%)
instructions 86,078,512,819 ( 0.92) 86,235,354,709 ( 0.91) 80,378,845,354 ( 0.91)
branches 15,732,850,506 ( 532.108) 15,743,473,327 ( 522.592) 14,725,420,241 ( 523.425)
branch-misses 104,546,578 ( 0.66%) 107,847,818 ( 0.69%) 106,343,602 ( 0.72%)
jobs2
stalled-cycles-frontend 118,614,605,521 ( 59.74%) 113,520,838,279 ( 59.94%) 104,301,243,221 ( 59.06%)
stalled-cycles-backend 59,490,170,824 ( 29.96%) 56,518,872,622 ( 29.84%) 50,161,702,782 ( 28.40%)
instructions 169,663,993,572 ( 0.85) 160,959,388,344 ( 0.85) 153,541,182,646 ( 0.87)
branches 31,859,926,551 ( 497.945) 30,132,524,256 ( 494.660) 28,579,927,064 ( 503.079)
branch-misses 164,531,311 ( 0.52%) 163,509,596 ( 0.54%) 145,472,902 ( 0.51%)
jobs3
stalled-cycles-frontend 153,932,401,104 ( 60.86%) 158,470,334,291 ( 60.81%) 150,767,641,835 ( 59.21%)
stalled-cycles-backend 77,023,824,597 ( 30.45%) 79,673,952,089 ( 30.57%) 72,693,245,174 ( 28.55%)
instructions 197,452,119,661 ( 0.78) 204,116,060,906 ( 0.78) 207,832,729,315 ( 0.82)
branches 36,579,918,543 ( 404.660) 37,980,582,651 ( 406.326) 39,091,715,974 ( 428.559)
branch-misses 214,292,753 ( 0.59%) 215,861,282 ( 0.57%) 203,320,703 ( 0.52%)
jobs4
stalled-cycles-frontend 237,223,396,661 ( 64.22%) 227,572,336,186 ( 64.37%) 202,100,979,033 ( 61.41%)
stalled-cycles-backend 129,935,296,918 ( 35.17%) 124,957,172,193 ( 35.34%) 103,626,575,103 ( 31.49%)
instructions 270,083,196,348 ( 0.73) 257,652,752,109 ( 0.73) 259,773,237,031 ( 0.79)
branches 52,120,828,566 ( 391.426) 49,121,254,042 ( 385.647) 49,896,944,076 ( 420.532)
branch-misses 260,480,947 ( 0.50%) 254,957,745 ( 0.52%) 239,402,681 ( 0.48%)
jobs5
stalled-cycles-frontend 257,778,703,389 ( 64.89%) 265,688,762,182 ( 65.13%) 229,916,792,090 ( 61.41%)
stalled-cycles-backend 142,090,098,727 ( 35.77%) 147,101,411,510 ( 36.06%) 117,081,586,471 ( 31.27%)
instructions 291,859,438,730 ( 0.73) 298,380,653,546 ( 0.73) 302,840,047,693 ( 0.81)
branches 55,111,567,225 ( 385.905) 56,316,470,332 ( 383.545) 57,500,842,324 ( 428.083)
branch-misses 270,056,201 ( 0.49%) 269,400,845 ( 0.48%) 258,495,925 ( 0.45%)
jobs6
stalled-cycles-frontend 311,626,093,277 ( 65.61%) 314,291,595,576 ( 65.77%) 249,524,291,273 ( 61.39%)
stalled-cycles-backend 174,358,063,361 ( 36.71%) 177,312,195,233 ( 37.10%) 126,508,172,269 ( 31.13%)
instructions 345,271,436,105 ( 0.73) 346,679,577,246 ( 0.73) 333,258,054,473 ( 0.82)
branches 65,298,537,641 ( 381.664) 65,995,652,812 ( 383.717) 62,730,160,550 ( 428.999)
branch-misses 313,241,654 ( 0.48%) 307,876,772 ( 0.47%) 282,570,360 ( 0.45%)
jobs7
stalled-cycles-frontend 333,896,608,350 ( 64.68%) 349,165,441,969 ( 64.85%) 276,185,831,513 ( 59.95%)
stalled-cycles-backend 186,083,638,772 ( 36.05%) 197,000,957,906 ( 36.59%) 138,835,486,733 ( 30.14%)
instructions 388,707,023,219 ( 0.75) 404,347,465,692 ( 0.75) 394,078,203,426 ( 0.86)
branches 71,999,476,930 ( 387.008) 76,197,698,685 ( 392.759) 73,195,649,665 ( 440.914)
branch-misses 328,598,294 ( 0.46%) 323,895,230 ( 0.43%) 298,205,996 ( 0.41%)
jobs8
stalled-cycles-frontend 378,806,234,772 ( 66.73%) 369,453,970,323 ( 66.55%) 313,738,845,641 ( 62.55%)
stalled-cycles-backend 211,732,966,238 ( 37.30%) 207,691,463,546 ( 37.41%) 161,120,924,768 ( 32.12%)
instructions 406,674,721,912 ( 0.72) 401,922,649,599 ( 0.72) 405,830,823,213 ( 0.81)
branches 75,637,492,422 ( 369.371) 74,287,789,757 ( 371.226) 75,967,291,039 ( 420.260)
branch-misses 355,733,892 ( 0.47%) 328,972,387 ( 0.44%) 318,203,258 ( 0.42%)
jobs9
stalled-cycles-frontend 422,712,242,907 ( 66.39%) 417,293,429,710 ( 66.14%) 343,703,467,466 ( 61.35%)
stalled-cycles-backend 239,356,726,574 ( 37.59%) 231,725,068,834 ( 36.73%) 172,101,321,805 ( 30.72%)
instructions 465,964,470,967 ( 0.73) 468,561,486,803 ( 0.74) 474,119,504,255 ( 0.85)
branches 86,724,291,348 ( 377.755) 86,534,438,758 ( 380.374) 88,431,722,886 ( 437.939)
branch-misses 385,706,052 ( 0.44%) 360,946,347 ( 0.42%) 337,858,267 ( 0.38%)
jobs10
stalled-cycles-frontend 451,844,797,592 ( 67.24%) 435,099,070,573 ( 67.18%) 352,877,428,118 ( 62.18%)
stalled-cycles-backend 255,533,666,521 ( 38.03%) 249,295,276,734 ( 38.49%) 179,754,582,074 ( 31.67%)
instructions 472,331,884,636 ( 0.70) 458,948,698,965 ( 0.71) 464,131,768,633 ( 0.82)
branches 88,848,212,769 ( 366.556) 85,330,239,413 ( 365.282) 86,837,838,069 ( 424.329)
branch-misses 398,856,497 ( 0.45%) 359,532,394 ( 0.42%) 333,821,387 ( 0.38%)



perf reported execution time

                4 streams    8 streams    per cpu
====================================================================
seconds elapsed 41.359653597 43.131195776 40.961640812
seconds elapsed 37.778174380 38.681792299 38.368529861
seconds elapsed 38.367149768 39.368008799 37.687545579
seconds elapsed 40.402963748 39.177529033 36.205357101
seconds elapsed 44.145428970 43.251655348 41.810848146
seconds elapsed 49.344988495 49.951048242 44.270045250
seconds elapsed 53.865398777 54.271392367 48.824173559
seconds elapsed 57.028770416 56.228105290 51.332017545
seconds elapsed 62.931350164 61.251237873 55.977463074
seconds elapsed 67.088285633 63.544376242 57.690998344
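
To put the elapsed times in relative terms, the sketch below computes how much shorter the per cpu run is than the 4 streams run for each row. It assumes the ten rows are in jobs1..jobs10 order, which the table does not state explicitly; `saving` is a hypothetical helper.

```python
# "seconds elapsed" rows from the table above: (4 streams, 8 streams, per cpu).
# assumption: row N corresponds to jobsN.
ELAPSED = [
    (41.359653597, 43.131195776, 40.961640812),
    (37.778174380, 38.681792299, 38.368529861),
    (38.367149768, 39.368008799, 37.687545579),
    (40.402963748, 39.177529033, 36.205357101),
    (44.145428970, 43.251655348, 41.810848146),
    (49.344988495, 49.951048242, 44.270045250),
    (53.865398777, 54.271392367, 48.824173559),
    (57.028770416, 56.228105290, 51.332017545),
    (62.931350164, 61.251237873, 55.977463074),
    (67.088285633, 63.544376242, 57.690998344),
]


def saving(base, new):
    """Percentage by which `new` is shorter than `base` (negative if longer)."""
    return (base - new) / base * 100


for jobs, (four, _eight, percpu) in enumerate(ELAPSED, start=1):
    print(f"jobs{jobs}: per cpu vs 4 streams {saving(four, percpu):+.1f}%")
```

A negative value means the per cpu run took longer than the 4 streams run for that row.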


-ss