Re: Unusually high system CPU usage with recent kernels

From: Tibor Billes
Date: Sun Sep 08 2013 - 13:22:54 EST


> From: Paul E. McKenney Sent: 09/07/13 02:23 AM
> On Tue, Sep 03, 2013 at 03:16:07PM -0700, Paul E. McKenney wrote:
> > On Tue, Sep 03, 2013 at 11:11:01PM +0200, Tibor Billes wrote:
> > > > From: Paul E. McKenney Sent: 08/30/13 03:24 AM
> > > > On Tue, Aug 27, 2013 at 10:05:42PM +0200, Tibor Billes wrote:
> > > > > From: Paul E. McKenney Sent: 08/26/13 06:28 AM
> > > > > > Here is a patch that is more likely to help. I am testing it in parallel,
> > > > > > but figured I should send you a sneak preview.
> > > > >
> > > > > I tried it, but I don't see any difference in overall performance. The dstat
> > > > > also shows the same as before.
> > > > >
> > > > > But I did notice something. Occasionally there is an increase in userspace
> > > > > CPU usage, interrupts and context switches are dropping, and it really gets
> > > > > more work done (scons printed commands more frequently). ÂI checked that
> > > > > this behaviour is present without your patch, I just didn't notice this
> > > > > before. Maybe you can make some sense out of it.
> > > > >
> > > > > ----system---- ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system-- ----swap--- ------memory-usage----- -----virtual-memory----
> > > > >   time   |usr sys idl wai hiq siq| read Âwrit| recv Âsend| Âin  out | int  csw | used Âfree| used Âbuff Âcach Âfree|majpf minpf alloc Âfree
> > > > > 27-08 20:51:53| 23 Â62 Â 5 Â 0 Â11 Â 0| Â 0 Â Â 0 | Â 0 Â Â 0 | Â 0 Â Â 0 |1274 Â3102k| Â 0 Â7934M| 549M 56.0M Â491M 6698M| Â 0 Â Â28 Â 156 Â 159
> > > > > 27-08 20:51:54| 24 Â64 Â 1 Â 0 Â11 Â 0| Â 0 Â Â 0 | Â 0 Â Â 0 | Â 0 Â Â 0 |1317 Â3165k| Â 0 Â7934M| 549M 56.0M Â491M 6698M| Â 0 Â Â53 Â 189 Â 182
> > > > > 27-08 20:51:55| 33 Â50  6  2  9  0| 192k 1832k|  0   0 |  0   0 |1371 Â2442k|  0 Â7934M| 544M 56.0M Â492M 6702M|  0  Â30k  17k  17k
> > > > > 27-08 20:51:56| 24 Â64 Â 0 Â 0 Â12 Â 0| Â 0 Â Â 0 | Â 0 Â Â 0 | Â 0 Â Â 0 |1313 Â3220k| Â 0 Â7934M| 544M 56.0M Â492M 6701M| Â 0 Â Â21 Â 272 Â 232
> > > > > 27-08 20:51:57| 24 Â64 Â 0 Â 0 Â12 Â 0| Â 0 Â Â 0 | Â 0 Â Â 0 | Â 0 Â Â 0 |1319 Â3226k| Â 0 Â7934M| 544M 56.0M Â492M 6701M| Â 0 Â Â 8 Â Â96 Â 112
> > > > > 27-08 20:51:58| 25 Â63 Â 0 Â 0 Â12 Â 0| Â 0 Â Â 0 | Â 0 Â Â 0 | Â 0 Â Â 0 |1317 Â3224k| Â 0 Â7934M| 544M 56.0M Â492M 6701M| Â 0 Â Â12 Â 145 Â 141
> > > > > 27-08 20:51:59| 24 Â64 Â 0 Â 0 Â12 Â 0| Â 0 Â Â 0 | Â 0 Â Â 0 | Â 0 Â Â 0 |1317 Â3223k| Â 0 Â7934M| 544M 56.0M Â492M 6701M| Â 0 Â Â54 Â 193 Â 191
> > > > > 27-08 20:52:00| 25 Â63 Â 0 Â 0 Â12 Â 0| Â 0 Â Â24k| Â 0 Â Â 0 | Â 0 Â Â 0 |1336 Â3216k| Â 0 Â7934M| 544M 56.0M Â492M 6701M| Â 0 Â Â36 Â 161 Â 172
> > > > > 27-08 20:52:01| 24 Â64 Â 0 Â 0 Â12 Â 0| Â 0 Â Â 0 | Â 0 Â Â 0 | Â 0 Â Â 0 |1313 Â3225k| Â 0 Â7934M| 544M 56.0M Â492M 6701M| Â 0 Â Â 9 Â 107 Â 107
> > > > > 27-08 20:52:02| 24 Â64 Â 0 Â 0 Â12 Â 0| Â 0 Â Â 0 | Â 0 Â Â 0 | Â 0 Â Â 0 |1327 Â3224k| Â 0 Â7934M| 545M 56.0M Â492M 6701M| Â 0 Â Â13 Â 193 Â 200
> > > > > 27-08 20:52:03| 24 Â64 Â 0 Â 0 Â12 Â 0| Â 0 Â Â 0 | Â 0 Â Â 0 | Â 0 Â Â 0 |1311 Â3226k| Â 0 Â7934M| 545M 56.0M Â492M 6701M| Â 0 Â Â13 Â 114 Â 114
> > > > > 27-08 20:52:04| 25 Â63 Â 0 Â 0 Â12 Â 0| Â 0 Â Â 0 | Â 0 Â Â 0 | Â 0 Â Â 0 |1331 Â3223k| Â 0 Â7934M| 544M 56.0M Â492M 6701M| Â 0 Â Â41 Â 190 Â 178
> > > > > 27-08 20:52:05| 24 Â64 Â 0 Â 0 Â12 Â 0| Â 0 Â8192B| Â 0 Â Â 0 | Â 0 Â Â 0 |1315 Â3222k| Â 0 Â7934M| 544M 56.0M Â492M 6701M| Â 0 Â Â30 Â 123 Â 122
> > > > > 27-08 20:52:06| 24 Â64 Â 0 Â 0 Â12 Â 0| Â 0 Â Â 0 | Â 0 Â Â 0 | Â 0 Â Â 0 |1314 Â3223k| Â 0 Â7934M| 544M 56.0M Â492M 6701M| Â 0 Â Â16 Â 187 Â 195
> > > > > 27-08 20:52:07| 25 Â63 Â 1 Â 0 Â12 Â 0|2212k Â192k| Â 0 Â Â 0 | Â 0 Â Â 0 |1637 Â3194k| Â 0 Â7934M| 544M 56.2M Â494M 6699M| Â 0 Â1363 Â2590 Â1947
> > > > > 27-08 20:52:08| 17 Â33 Â18 Â26  6  0|3208k  Â0 |  0   0 |  0   0 |1351 Â1766k|  0 Â7934M| 561M 56.3M Â499M 6678M|  4  Â10k 7620 Â2055
> > > > > 27-08 20:52:09| 47 Â21 Â16 Â13  4  0|4332k Â628k|  0   0 |  0   0 |1400 Â1081k|  0 Â7934M| 647M 56.3M Â504M 6587M| Â10  Â24k  25k 1151
> > > > > 27-08 20:52:10| 36 Â34 Â13 Â11  6  0|2636k 2820k|  0   0 |  0   0 |1451 Â1737k|  0 Â7934M| 598M 56.3M Â507M 6632M|  5  Â19k  16k  28k
> > > > > 27-08 20:52:11| 46 Â17 Â10 Â25 Â 3 Â 0|4288k Â536k| Â 0 Â Â 0 | Â 0 Â Â 0 |1386 Â 868k| Â 0 Â7934M| 613M 56.3M Â513M 6611M| Â24 Â Â13k 8908 Â3616
> > > > > 27-08 20:52:12| 53 Â33  5  4  5  0|4740k 3992k|  0   0 |  0   0 |1773 Â1464k|  0 Â7934M| 562M 56.6M Â521M 6654M|  0  Â36k  29k  40k
> > > > > 27-08 20:52:13| 60 Â34  0  1  6  0|4228k 1416k|  0   0 |  0   0 |1642 Â1670k|  0 Â7934M| 593M 56.6M Â526M 6618M|  0  Â36k  26k  17k
> > > > > 27-08 20:52:14| 53 Â37 Â 1 Â 3 Â 7 Â 0|3008k 1976k| Â 0 Â Â 0 | Â 0 Â Â 0 |1513 Â1972k| Â 0 Â7934M| 541M 56.6M Â529M 6668M| Â 0 Â Â10k 9986 Â Â23k
> > > > > 27-08 20:52:15| 55 Â34  1  4  6  0|3636k 1284k|  0   0 |  0   0 |1645 Â1688k|  0 Â7934M| 581M 56.6M Â535M 6622M|  0  Â43k  30k  18k
> > > > > 27-08 20:52:16| 57 Â30  5  2  5  0|4404k 2320k|  0   0 |  0   0 |1715 Â1489k|  0 Â7934M| 570M 56.6M Â541M 6627M|  0  Â39k  24k  26k
> > > > > 27-08 20:52:17| 50 Â35  3  7  6  0|2520k 1972k|  0   0 |  0   0 |1699 Â1598k|  0 Â7934M| 587M 56.6M Â554M 6596M|  0  Â65k  40k  32k
> > > > > 27-08 20:52:18| 52 Â40  2  1  7  0|1556k 1732k|  0   0 |  0   0 |1582 Â1865k|  0 Â7934M| 551M 56.6M Â567M 6619M|  0  Â35k  26k  33k
> > > > > 27-08 20:52:19| 26 Â62 Â 0 Â 0 Â12 Â 0| Â 0 Â Â 0 | Â 0 Â Â 0 | Â 0 Â Â 0 |1351 Â3240k| Â 0 Â7934M| 551M 56.6M Â568M 6619M| Â 0 Â Â86 Â 440 Â 214
> > > > > 27-08 20:52:20| 26 Â63 Â 0 Â 0 Â11 Â 0| Â 0 Â 108k| Â 0 Â Â 0 | Â 0 Â Â 0 |1392 Â3162k| Â 0 Â7934M| 555M 56.6M Â560M 6623M| Â 0 Â1801 Â1490 Â2672
> > > > > 27-08 20:52:21| 25 Â63 Â 0 Â 0 Â12 Â 0| Â 0 Â Â 0 | Â 0 Â Â 0 | Â 0 Â Â 0 |1332 Â3198k| Â 0 Â7934M| 555M 56.6M Â560M 6623M| Â 0 Â Â50 Â 245 Â 255
> > > > > 27-08 20:52:22| 25 Â63 Â 1 Â 0 Â12 Â 0| Â 0 Â Â 0 | Â 0 Â Â 0 | Â 0 Â Â 0 |1350 Â3220k| Â 0 Â7934M| 556M 56.6M Â560M 6622M| Â 0 Â 755 Â 544 Â 286
> > > > > 27-08 20:52:23| 27 Â62 Â 1 Â 0 Â11 Â 0| Â 0 Â 272k| Â 0 Â Â 0 | Â 0 Â Â 0 |1323 Â3092k| Â 0 Â7934M| 551M 56.6M Â558M 6628M| Â 0 Â 341 Â1464 Â3085
> > > > > 27-08 20:52:24| 25 Â63 Â 0 Â 0 Â12 Â 0| Â 0 Â Â 0 | Â 0 Â Â 0 | Â 0 Â Â 0 |1334 Â3197k| Â 0 Â7934M| 551M 56.6M Â559M 6627M| Â 0 Â Â63 Â 514 Â 273
> > > > > 27-08 20:52:25| 25 Â63 Â 0 Â 0 Â12 Â 0| Â 0 Â Â40k| Â 0 Â Â 0 | Â 0 Â Â 0 |1329 Â3243k| Â 0 Â7934M| 546M 56.6M Â558M 6633M| Â 0 Â 321 Â 160 Â1679
> > > > > 27-08 20:52:26| 39 Â50  1  1  9  0| Â48k Â644k|  0   0 |  0   0 |1500 Â2556k|  0 Â7934M| 552M 56.6M Â560M 6625M|  0  Â30k  14k  12k
> > > > > 27-08 20:52:27| 26 Â62 Â 1 Â 0 Â11 Â 0| Â 0 Â 192k| Â 0 Â Â 0 | Â 0 Â Â 0 |1380 Â3152k| Â 0 Â7934M| 553M 56.6M Â560M 6624M| Â 0 Â2370 Â 808 Â 718
> > > > > 27-08 20:52:28| 23 Â55 Â12 Â 0 Â10 Â 0| Â 0 Â Â 0 | Â 0 Â Â 0 | Â 0 Â Â 0 |1247 Â2993k| Â 0 Â7934M| 553M 56.6M Â561M 6624M| Â 0 Â1060 Â 428 Â 241
> > > > > 27-08 20:52:29| 25 Â63 Â 1 Â 0 Â11 Â 0| Â 0 Â Â 0 | Â 0 Â Â 0 | Â 0 Â Â 0 |1318 Â3142k| Â 0 Â7934M| 554M 56.6M Â561M 6623M| Â 0 Â 663 Â 442 Â 198
> > > > > 27-08 20:52:30| 25 Â64 Â 0 Â 0 Â12 Â 0| Â 0 Â 100k| Â 0 Â Â 0 | Â 0 Â Â 0 |1316 Â3212k| Â 0 Â7934M| 554M 56.6M Â561M 6623M| Â 0 Â Â42 Â 187 Â 186
> > > > > 27-08 20:52:31| 24 Â64 Â 0 Â 0 Â12 Â 0| Â 0 Â4096B| Â 0 Â Â 0 | Â 0 Â Â 0 |1309 Â3222k| Â 0 Â7934M| 554M 56.6M Â561M 6623M| Â 0 Â Â 9 Â Â85 Â Â85
> > > > > 27-08 20:52:32| 24 Â64 Â 0 Â 0 Â12 Â 0| Â 0 Â Â 0 | Â 0 Â Â 0 | Â 0 Â Â 0 |1297 Â3219k| Â 0 Â7934M| 554M 56.6M Â561M 6623M| Â 0 Â Â23 Â Â95 Â Â84
> > > > > 27-08 20:52:33| 25 Â63 Â 0 Â 0 Â11 Â 0| Â 0 Â Â 0 | Â 0 Â Â 0 | Â 0 Â Â 0 |1339 Â3101k| Â 0 Â7934M| 555M 56.6M Â557M 6625M| Â 0 Â 923 Â1566 Â2176
> > > > > 27-08 20:52:34| 25 Â63 Â 0 Â 0 Â11 Â 0| Â 0 Â Â 0 | Â 0 Â Â 0 | Â 0 Â Â 0 |1333 Â3095k| Â 0 Â7934M| 555M 56.6M Â559M 6623M| Â 0 Â 114 Â 701 Â 195
> > > > > 27-08 20:52:35| 26 Â63 Â 0 Â 0 Â11 Â 0| Â 0 Â Â 0 | Â 0 Â Â 0 | Â 0 Â Â 0 |1342 Â3036k| Â 0 Â7934M| 557M 56.6M Â568M 6613M| Â 0 Â 957 Â3225 Â 516
> > > > > 27-08 20:52:36| 26 Â60 Â 2 Â 0 Â11 Â 0| Â 0 Â Â28k| Â 0 Â Â 0 | Â 0 Â Â 0 |1340 Â3091k| Â 0 Â7934M| 555M 56.6M Â568M 6614M| Â 0 Â5304 Â5422 Â5609
> > > > > 27-08 20:52:37| 25 Â64 Â 0 Â 0 Â12 Â 0| Â 0 Â Â 0 | Â 0 Â Â 0 | Â 0 Â Â 0 |1348 Â3260k| Â 0 Â7934M| 556M 56.6M Â565M 6617M| Â 0 Â Â30 Â 156 Â1073
> > > > > 27-08 20:52:38| 24 Â64 Â 0 Â 0 Â12 Â 0| Â 0 Â Â 0 | Â 0 Â Â 0 | Â 0 Â Â 0 |1314 Â3211k| Â 0 Â7934M| 556M 56.6M Â549M 6633M| Â 0 Â Â11 Â 105 Â4285
> > > > > 27-08 20:52:39| 26 Â63 Â 0 Â 0 Â11 Â 0| Â 0 Â Â 0 | Â 0 Â Â 0 | Â 0 Â Â 0 |1353 Â3031k| Â 0 Â7934M| 558M 56.6M Â559M 6620M| Â 0 Â 847 Â3866 Â 357
> > > > > 27-08 20:52:40| 26 Â63 Â 0 Â 0 Â12 Â 0| Â 0 Â Â 0 | Â 0 Â Â 0 | Â 0 Â Â 0 |1309 Â3135k| Â 0 Â7934M| 569M 56.6M Â566M 6602M| Â 0 Â3940 Â5727 Â1288
> > > >
> > > > Interesting. The number of context switches drops during the time
> > > > that throughput improves. It would be good to find out what task(s)
> > > > are doing all of the context switches. One way to find the pids of the
> > > > top 20 context-switching tasks should be something like this:
> > > >
> > > > grep ctxt /proc/*/status | sort -k2nr | head -20
> > > >
> > > > You could use any number of methods to map back to the command.
> > > > When generating my last patch, I was assuming that ksoftirqd would be
> > > > the big offender. Of course, if it is something else, I should be
> > > > taking a different approach.
> > >
> > > I did some measuring of these context switches. I used the above command
> > > line to get the number of context switches before compiling and about 40
> > > seconds later.
> > >
> > > The good case is:
> > > file                    Âbefore Âafter  process name
> > > /proc/10/status:voluntary_ctxt_switches: Â Â38108 Â 54327 Â [rcu_sched]
> > > /proc/19/status:voluntary_ctxt_switches: Â Â31022 Â 38682 Â [ksoftirqd/2]
> > > /proc/14/status:voluntary_ctxt_switches: Â Â12947 Â 16914 Â [ksoftirqd/1]
> > > /proc/3/status:voluntary_ctxt_switches: Â Â 11061 Â 14885 Â [ksoftirqd/0]
> > > /proc/194/status:voluntary_ctxt_switches: Â 10044 Â 11660 Â [kworker/2:1H]
> > > /proc/1603/status:voluntary_ctxt_switches: Â7626 Â Â9593 Â Â/usr/bin/X
> > > /proc/24/status:voluntary_ctxt_switches: Â Â6415 Â Â9571 Â Â[ksoftirqd/3]
> > > /proc/20/status:voluntary_ctxt_switches: Â Â5317 Â Â8879 Â Â[kworker/2:0]
> > >
> > > The bad case is:
> > > file                      Âbefore   Âafter    process name
> > > /proc/3/status:voluntary_ctxt_switches: Â Â Â Â 82411972 Â Â98542227 Â Â[ksoftirqd/0]
> > > /proc/23/status:voluntary_ctxt_switches: Â Â Â Â79592040 Â Â94206823 Â Â[ksoftirqd/3]
> > > /proc/13/status:voluntary_ctxt_switches: Â Â Â Â79053870 Â Â93654161 Â Â[ksoftirqd/1]
> > > /proc/18/status:voluntary_ctxt_switches: Â Â Â Â78136449 Â Â92288688 Â Â[ksoftirqd/2]
> > > /proc/2293/status:nonvoluntary_ctxt_switches: Â 29038881 Â Â29038881 Â Âmate-panel
> > > /proc/2308/status:nonvoluntary_ctxt_switches: Â 26607661 Â Â26607661 Â Ânm-applet
> > > /proc/2317/status:nonvoluntary_ctxt_switches: Â 15494474 Â Â15494474 Â Â/usr/lib/polkit-mate/polkit-mate-authentication-agent-1
> > > /proc/2148/status:nonvoluntary_ctxt_switches: Â 13763674 Â Â13763674 Â Âx-session-manager
> > > /proc/2985/status:nonvoluntary_ctxt_switches: Â Â Â Â Â Â Â 12062706 Â Âpython /usr/bin/scons -j4
> > > /proc/2323/status:nonvoluntary_ctxt_switches: Â 11581510 Â Â11581510 Â Âmate-volume-control-applet
> > > /proc/2353/status:nonvoluntary_ctxt_switches: Â 9213436 Â Â 9213436 Â Â mate-power-manager
> > > /proc/2305/status:nonvoluntary_ctxt_switches: Â 8328471 Â Â 8328471 Â Â /usr/lib/matecomponent/matecomponent-activation-server
> > > /proc/3041/status:nonvoluntary_ctxt_switches: Â Â Â Â Â Â Â 7638312 Â Â sh -c "/opt/arm-2011.03/bin/arm-none-eabi-gcc" [...]
> > > /proc/3055/status:nonvoluntary_ctxt_switches: Â Â Â Â Â Â Â 4843977 Â Â sh -c "/opt/arm-2011.03/bin/arm-none-eabi-gcc" [...]
> > > /proc/2253/status:nonvoluntary_ctxt_switches: Â 4449918 Â Â 4449918 Â Â mate-keyring-daemon
> >
> > Thank you for the info!
> >
> > > The processes for the tables were partly collected by hand so they may
> > > not contain every relevant process. But what I see is that there are a
> > > lot more context switches in the bad case, quite many of these are non
> > > voluntary. The context switch increase is pretty much everywhere, not
> > > just ksoftirqd. I suspect the Mate related processes did their context
> > > switches during bootup/login, which is consistent with the bootup/login
> > > procedure being significantly slower (just like the compilation).
> >
> > Also, the non-ksoftirq context switches are nonvoluntary, which likely
> > means that some of their context switches were preemptions by ksoftirqd.
> >
> > OK, time to look at whatever else might be causing this...
>
> I am still unable to reproduce this, so am reduced to shooting in the
> dark. The attached patch could in theory help. If it doesn't, would
> you be willing to do a bit of RCU event tracing?
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> Ârcu: Throttle invoke_rcu_core() invocations due to non-lazy callbacks
> Â
> ÂIf a non-lazy callback arrives on a CPU that has previously gone idle
> Âwith no non-lazy callbacks, invoke_rcu_core() forces the RCU core to
> Ârun. However, it does not update the conditions, which could result
> Âin several closely spaced invocations of the RCU core, which in turn
> Âcould result in an excessively high context-switch rate and resulting
> Âhigh overhead.
> Â
> ÂThis commit therefore updates the ->all_lazy and ->nonlazy_posted_snap
> Âfields to prevent closely spaced invocations.
> Â
> ÂReported-by: Tibor Billes <tbilles@xxxxxxx>
> ÂSigned-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
>
> diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> index dc09fb0..9d445ab 100644
> --- a/kernel/rcutree_plugin.h
> +++ b/kernel/rcutree_plugin.h
> @@ -1745,6 +1745,8 @@ static void rcu_prepare_for_idle(int cpu)
> Â */
> Â if (rdtp->all_lazy &&
> Â rdtp->nonlazy_posted != rdtp->nonlazy_posted_snap) {
> + rdtp->all_lazy = false;
> + rdtp->nonlazy_posted_snap = rdtp->nonlazy_posted;
> Â invoke_rcu_core();
> Â return;
> Â }

Good news Paul, the above patch did solve this issue :) I see no extra
context switches, no extra CPU usage and no extra compile time.

Any idea why couldn't you reproduce this? Why did it only hit my system?

Tibor
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/