irq_work_sync hangs

From: Daniel J Blueman
Date: Thu May 14 2015 - 05:47:02 EST


I've been finding intermittent hangs in irq_work_sync [1]. I'm able to reproduce this on 4.0.2-stock sporadically when starting and terminating 'perf top' while the system is saturated with a heavy 64-core HPC workload (NPB bt.D.x); the hanging occurs when perf tears down.

Sometimes, the hangs resolve after a few minutes. Adding debug, filtering printk work via testing for IRQ_WORK_LAZY shows a "sync id 17 end" without a matching "sync id 17 start"; this suggests that IRQ_WORK_LAZY was set in flags after the work was queued, and cleared before irq_work_sync finished (along with IRQ_WORK_BUSY), which seems unexpected.

Any tips from people familiar with the behaviour?

Thanks,
Daniel

-- [1] (see [2] for instrumentation)

[ 42.957711] sync id 0 start flags=0x0
[ 42.961433] sync id 0 end
[ 42.964114] sync id 0 start flags=0x0
[ 42.967795] sync id 0 end
[ 42.970533] sync id 0 start flags=0x0
[ 42.974202] sync id 0 end
[ 48.362363] queued id 0 cpu 1
[ 48.362363] queued id 1 cpu 1
[ 50.080873] sync id 0 start flags=0x0
[ 50.084540] sync id 0 end
[ 50.087184] sync id 0 start flags=0x0
[ 50.087199] run id 11 end flags=0x0
[ 50.087201] perf interrupt took too long (2901 > 156250), lowering kernel.perf_event_max_sample_rate to 800
[ 50.087202] run id 10 end flags=0x0
[ 50.115852] sync id 0 end
[ 50.118484] sync id 0 start flags=0x0
[ 50.122158] sync id 0 end
[ 50.124800] sync id 0 start flags=0x0
[ 50.128470] sync id 0 end
[ 50.131133] sync id 0 start flags=0x0
[ 50.134799] sync id 0 end
[ 50.137448] sync id 0 start flags=0x0
[ 50.141166] sync id 0 end
[ 50.143858] sync id 0 start flags=0x0
[ 50.147547] sync id 0 end
[ 50.150243] sync id 0 start flags=0x0
[ 50.154305] sync id 0 end
[ 50.161514] sync id 0 start flags=0x0
[ 50.166031] sync id 0 end
[ 50.169422] sync id 0 start flags=0x0
[ 50.173325] sync id 0 end
[ 50.178483] sync id 0 start flags=0x0
[ 50.182427] sync id 0 end
[ 50.185172] sync id 0 start flags=0x0
[ 50.188844] sync id 0 end
[ 50.223568] queued id 12 cpu 22
[ 50.250641] sync id 0 start flags=0x0
[ 50.258890] sync id 0 end
[ 50.263910] sync id 0 start flags=0x0
[ 50.267706] sync id 0 end
[ 50.271277] sync id 0 start flags=0x0
[ 50.275190] sync id 0 end
[ 50.280644] sync id 0 start flags=0x0
[ 50.284472] sync id 0 end
[ 50.287258] sync id 0 start flags=0x0
[ 50.291101] sync id 0 end
[ 50.298058] sync id 0 start flags=0x0
[ 50.301804] sync id 0 end
[ 50.304549] sync id 0 start flags=0x0
[ 50.308267] sync id 0 end
[ 50.311069] sync id 0 start flags=0x0
[ 50.314752] sync id 0 end
[ 50.317442] sync id 0 start flags=0x0
[ 50.321125] sync id 0 end
[ 50.323790] sync id 0 start flags=0x0
[ 50.327461] sync id 0 end
[ 50.330114] sync id 0 start flags=0x0
[ 50.330124] INFO: NMI handler (perf_ibs_nmi_handler) took too long to run: 2.042 msecs
[ 50.330126] run id 13 end flags=0x0
[ 50.345202] sync id 0 end
[ 50.347850] sync id 0 start flags=0x0
[ 50.351582] sync id 0 end
[ 50.354233] sync id 0 start flags=0x0
[ 50.357932] sync id 0 end
[ 50.360584] sync id 0 start flags=0x0
[ 50.364252] sync id 0 end
[ 50.366890] sync id 0 start flags=0x0
[ 50.370606] sync id 0 end
[ 50.373246] sync id 0 start flags=0x0
[ 50.376914] sync id 0 end
[ 50.379551] sync id 0 start flags=0x0
[ 50.383256] sync id 0 end
[ 50.385900] sync id 0 start flags=0x0
[ 50.389564] sync id 0 end
[ 50.392232] sync id 0 start flags=0x0
[ 50.395904] sync id 0 end
[ 50.398566] sync id 0 start flags=0x0
[ 50.402239] sync id 0 end
[ 50.420270] sync id 0 start flags=0x0
[ 50.423938] sync id 0 end
[ 50.426620] sync id 0 start flags=0x0
[ 50.430340] sync id 0 end
[ 50.433020] sync id 0 start flags=0x0
[ 50.436686] sync id 0 end
[ 50.439357] sync id 0 start flags=0x0
[ 50.443069] sync id 0 end
[ 50.480261] sync id 0 start flags=0x0
[ 50.483928] sync id 0 end
[ 50.486632] sync id 0 start flags=0x0
[ 50.490319] sync id 0 end
[ 50.493076] sync id 0 start flags=0x0
[ 50.496744] sync id 0 end
[ 50.499425] sync id 0 start flags=0x0
[ 50.503138] sync id 0 end
[ 50.520260] sync id 0 start flags=0x0
[ 50.523970] sync id 0 end
[ 50.526647] sync id 0 start flags=0x0
[ 50.530357] sync id 0 end
[ 50.533037] sync id 0 start flags=0x0
[ 50.536719] sync id 0 end
[ 50.539386] sync id 0 start flags=0x0
[ 50.543133] sync id 0 end
[ 50.580246] sync id 0 start flags=0x0
[ 50.583943] sync id 0 end
[ 50.586626] sync id 0 start flags=0x0
[ 50.590346] sync id 0 end
[ 50.593019] sync id 0 start flags=0x0
[ 50.596692] sync id 0 end
[ 50.599372] sync id 0 start flags=0x0
[ 50.603095] sync id 0 end
[ 50.630339] sync id 0 start flags=0x0
[ 50.634047] sync id 0 end
[ 50.636737] sync id 0 start flags=0x0
[ 50.640450] sync id 0 end
[ 50.643173] sync id 0 start flags=0x0
[ 50.646838] sync id 0 end
[ 50.649524] sync id 0 start flags=0x0
[ 50.653219] sync id 0 end
[ 50.680349] sync id 0 start flags=0x0
[ 50.684022] sync id 0 end
[ 50.686720] sync id 0 start flags=0x0
[ 50.690468] sync id 0 end
[ 50.693162] sync id 0 start flags=0x0
[ 50.696852] sync id 0 end
[ 50.699555] sync id 0 start flags=0x0
[ 50.703232] sync id 0 end
[ 50.730275] sync id 0 start flags=0x0
[ 50.733949] sync id 0 end
[ 50.736646] sync id 0 start flags=0x0
[ 50.740363] sync id 0 end
[ 50.743078] sync id 0 start flags=0x0
[ 50.746766] sync id 0 end
[ 50.749455] sync id 0 start flags=0x0
[ 50.753183] sync id 0 end
[ 50.780274] sync id 0 start flags=0x0
[ 50.783949] sync id 0 end
[ 50.786653] sync id 0 start flags=0x0
[ 50.790393] sync id 0 end
[ 50.793097] sync id 0 start flags=0x0
[ 50.796774] sync id 0 end
[ 50.799474] sync id 0 start flags=0x0
[ 50.803213] sync id 0 end
[ 54.887194] sync id 0 start flags=0x0
[ 54.891194] sync id 0 end
[ 54.895879] sync id 0 start flags=0x0
[ 54.899578] sync id 0 end
[ 54.912322] sync id 0 start flags=0x0
[ 54.916085] sync id 0 end
[ 66.145994] sync id 0 start flags=0x0
[ 66.149708] sync id 0 end
[ 66.152380] sync id 0 start flags=0x0
[ 66.156072] sync id 0 end
[ 66.158696] sync id 0 start flags=0x0
[ 66.162367] sync id 0 end
[ 66.165003] sync id 0 start flags=0x0
[ 66.168667] sync id 0 end
[ 66.171314] sync id 0 start flags=0x0
[ 66.175001] sync id 0 end
[ 66.177648] sync id 0 start flags=0x0
[ 66.181322] sync id 0 end
[ 66.184013] sync id 0 start flags=0x0
[ 66.187673] sync id 0 end
[ 66.190325] sync id 0 start flags=0x0
[ 66.194026] sync id 0 end
[ 66.196676] sync id 0 start flags=0x0
[ 66.200351] sync id 0 end
[ 66.203034] sync id 0 start flags=0x0
[ 66.206697] sync id 0 end
[ 66.209340] sync id 0 start flags=0x0
[ 66.213010] sync id 0 end
[ 66.215683] sync id 0 start flags=0x0
[ 66.219351] sync id 0 end
[ 66.222000] sync id 0 start flags=0x0
[ 66.225703] sync id 0 end
[ 66.228340] sync id 0 start flags=0x0
[ 66.232005] sync id 0 end
[ 66.234653] sync id 0 start flags=0x0
[ 66.238351] sync id 0 end
[ 66.240994] sync id 0 start flags=0x0
[ 66.244658] sync id 0 end
[ 66.247336] sync id 0 start flags=0x0
[ 66.251004] sync id 0 end
[ 66.253648] sync id 0 start flags=0x0
[ 66.257314] sync id 0 end
[ 66.259950] sync id 0 start flags=0x0
[ 66.270311] sync id 0 end
[ 66.272981] sync id 0 start flags=0x0
[ 66.276649] sync id 0 end
[ 66.279296] sync id 0 start flags=0x0
[ 66.282961] sync id 0 end
[ 66.285600] sync id 0 start flags=0x0
[ 66.289312] sync id 0 end
[ 66.291962] sync id 0 start flags=0x0
[ 66.295630] sync id 0 end
[ 66.298310] sync id 0 start flags=0x0
[ 66.301975] sync id 0 end
[ 66.304618] sync id 0 start flags=0x0
[ 66.308327] sync id 0 end
[ 66.310978] sync id 0 start flags=0x0
[ 66.314645] sync id 0 end
[ 66.317290] sync id 0 start flags=0x0
[ 66.320995] sync id 0 end
[ 66.323645] sync id 0 start flags=0x0
[ 66.327353] sync id 0 end
[ 66.329989] sync id 0 start flags=0x0
[ 66.333653] sync id 0 end
[ 66.336296] sync id 0 start flags=0x0
[ 66.339996] sync id 0 end
[ 66.342644] sync id 0 start flags=0x0
[ 66.346312] sync id 0 end
[ 66.348951] sync id 0 start flags=0x0
[ 66.352620] sync id 0 end
[ 66.355321] sync id 0 start flags=0x0
[ 66.359022] sync id 0 end
[ 66.361704] sync id 0 start flags=0x0
[ 66.365388] sync id 0 end
[ 66.368057] sync id 0 start flags=0x0
[ 66.371771] sync id 0 end
[ 66.374443] sync id 0 start flags=0x0
[ 66.378133] sync id 0 end
[ 66.380908] sync id 0 start flags=0x0
[ 66.384587] sync id 0 end
[ 66.387259] sync id 0 start flags=0x0
[ 66.390969] sync id 0 end
[ 66.393642] sync id 0 start flags=0x0
[ 66.397321] sync id 0 end
[ 66.399992] sync id 0 start flags=0x0
[ 66.403702] sync id 0 end
[ 66.406384] sync id 0 start flags=0x0
[ 66.410070] sync id 0 end
[ 66.412749] sync id 0 start flags=0x0
[ 66.416430] sync id 0 end
[ 66.419098] sync id 0 start flags=0x0
[ 66.422768] sync id 0 end
[ 66.425467] sync id 0 start flags=0x0
[ 66.429174] sync id 0 end
[ 66.431854] sync id 0 start flags=0x0
[ 66.435548] sync id 0 end
[ 66.438227] sync id 0 start flags=0x0
[ 66.441926] sync id 0 end
[ 66.444603] sync id 0 start flags=0x0
[ 66.448283] sync id 0 end
[ 66.450958] sync id 0 start flags=0x0
[ 66.454651] sync id 0 end
[ 66.457327] sync id 0 start flags=0x0
[ 66.461020] sync id 0 end
[ 66.463702] sync id 0 start flags=0x0
[ 66.467394] sync id 0 end
[ 66.470074] sync id 0 start flags=0x0
[ 66.473769] sync id 0 end
[ 66.476447] sync id 0 start flags=0x0
[ 66.480127] sync id 0 end
[ 66.482811] sync id 0 start flags=0x0
[ 66.486494] sync id 0 end
[ 66.489164] sync id 0 start flags=0x0
[ 66.492877] sync id 0 end
[ 66.495552] sync id 0 start flags=0x0
[ 66.499237] sync id 0 end
[ 66.501917] sync id 0 start flags=0x0
[ 66.505605] sync id 0 end
[ 66.508276] sync id 0 start flags=0x0
[ 66.511982] sync id 0 end
[ 66.514663] sync id 0 start flags=0x0
[ 66.518347] sync id 0 end
[ 66.521028] sync id 0 start flags=0x0
[ 66.524716] sync id 0 end
[ 66.527387] sync id 0 start flags=0x0
[ 66.531093] sync id 0 end
[ 66.533772] sync id 0 start flags=0x0
[ 66.537458] sync id 0 end
[ 66.540140] sync id 0 start flags=0x0
[ 66.543825] sync id 0 end
[ 66.546502] sync id 0 start flags=0x0
[ 66.550209] sync id 0 end
[ 66.552884] sync id 0 start flags=0x0
[ 66.556569] sync id 0 end
[ 89.984699] sync id 0 start flags=0x0
[ 89.988416] sync id 0 end
[ 89.991131] sync id 0 start flags=0x0
[ 89.994793] sync id 0 end
[ 89.997442] sync id 0 start flags=0x0
[ 90.001114] sync id 0 end
[ 96.133149] sync id 0 start flags=0x0
[ 96.136815] sync id 0 end
[ 96.139463] sync id 0 start flags=0x0
[ 96.143179] sync id 0 end
[ 96.145824] sync id 0 start flags=0x0
[ 96.149527] sync id 0 end
[ 96.152182] sync id 0 start flags=0x0
[ 96.155854] sync id 0 end
[ 96.158499] sync id 0 start flags=0x0
[ 96.162201] sync id 0 end
[ 96.164843] sync id 0 start flags=0x0
[ 96.168509] sync id 0 end
[ 96.171140] sync id 0 start flags=0x0
[ 96.174833] sync id 0 end
[ 96.177469] sync id 0 start flags=0x0
[ 96.181136] sync id 0 end
[ 96.183780] sync id 0 start flags=0x0
[ 96.187448] sync id 0 end
[ 96.190093] sync id 0 start flags=0x0
[ 96.193781] sync id 0 end
[ 96.196422] sync id 0 start flags=0x0
[ 96.200082] sync id 0 end
[ 96.202720] sync id 0 start flags=0x0
[ 96.206426] sync id 0 end
[ 96.209070] sync id 0 start flags=0x0
[ 96.212735] sync id 0 end
[ 96.215378] sync id 0 start flags=0x0
[ 96.219073] sync id 0 end
[ 96.221726] sync id 0 start flags=0x0
[ 96.225399] sync id 0 end
[ 96.228042] sync id 0 start flags=0x0
[ 96.231736] sync id 0 end
[ 96.234377] sync id 0 start flags=0x0
[ 96.238045] sync id 0 end
[ 96.240691] sync id 0 start flags=0x0
[ 96.244380] sync id 0 end
[ 96.247025] sync id 0 start flags=0x0
[ 96.250695] sync id 0 end
[ 96.253359] sync id 0 start flags=0x0
[ 96.257026] sync id 0 end
[ 96.259673] sync id 0 start flags=0x0
[ 96.263343] sync id 0 end
[ 96.265984] sync id 0 start flags=0x0
[ 96.269689] sync id 0 end
[ 96.272333] sync id 0 start flags=0x0
[ 96.275995] sync id 0 end
[ 96.278634] sync id 0 start flags=0x0
[ 96.282341] sync id 0 end
[ 96.284984] sync id 0 start flags=0x0
[ 96.288652] sync id 0 end
[ 96.291298] sync id 0 start flags=0x0
[ 96.294985] sync id 0 end
[ 96.297626] sync id 0 start flags=0x0
[ 96.306727] sync id 0 end
[ 96.309407] sync id 0 start flags=0x0
[ 96.313075] sync id 0 end
[ 96.315717] sync id 0 start flags=0x0
[ 96.319420] sync id 0 end
[ 96.322066] sync id 0 start flags=0x0
[ 96.325736] sync id 0 end
[ 96.328375] sync id 0 start flags=0x0
[ 96.332082] sync id 0 end
[ 96.334725] sync id 0 start flags=0x0
[ 96.338392] sync id 0 end
[ 96.341066] sync id 0 start flags=0x0
[ 96.344735] sync id 0 end
[ 96.347406] sync id 0 start flags=0x0
[ 96.351102] sync id 0 end
[ 96.353799] sync id 0 start flags=0x0
[ 96.357459] sync id 0 end
[ 96.360189] sync id 0 start flags=0x0
[ 96.363862] sync id 0 end
[ 96.366547] sync id 0 start flags=0x0
[ 96.370210] sync id 0 end
[ 96.372892] sync id 0 start flags=0x0
[ 96.376569] sync id 0 end
[ 96.379249] sync id 0 start flags=0x0
[ 96.382949] sync id 0 end
[ 96.385621] sync id 0 start flags=0x0
[ 96.389312] sync id 0 end
[ 96.391992] sync id 0 start flags=0x0
[ 96.395680] sync id 0 end
[ 96.398366] sync id 0 start flags=0x0
[ 96.402050] sync id 0 end
[ 96.404735] sync id 0 start flags=0x0
[ 96.408414] sync id 0 end
[ 96.411098] sync id 0 start flags=0x0
[ 96.414789] sync id 0 end
[ 96.417457] sync id 0 start flags=0x0
[ 96.421165] sync id 0 end
[ 96.423838] sync id 0 start flags=0x0
[ 96.427525] sync id 0 end
[ 96.430215] sync id 0 start flags=0x0
[ 96.433903] sync id 0 end
[ 96.436613] sync id 0 start flags=0x0
[ 96.440282] sync id 0 end
[ 96.442966] sync id 0 start flags=0x0
[ 96.446632] sync id 0 end
[ 96.449314] sync id 0 start flags=0x0
[ 96.453008] sync id 0 end
[ 96.455701] sync id 0 start flags=0x0
[ 96.459368] sync id 0 end
[ 96.462060] sync id 0 start flags=0x0
[ 96.465734] sync id 0 end
[ 96.468413] sync id 0 start flags=0x0
[ 96.472123] sync id 0 end
[ 96.474807] sync id 0 start flags=0x0
[ 96.478488] sync id 0 end
[ 96.481167] sync id 0 start flags=0x0
[ 96.484845] sync id 0 end
[ 96.487519] sync id 0 start flags=0x0
[ 96.491222] sync id 0 end
[ 96.493904] sync id 0 start flags=0x0
[ 96.497590] sync id 0 end
[ 96.500266] sync id 0 start flags=0x0
[ 96.503962] sync id 0 end
[ 96.506649] sync id 0 start flags=0x0
[ 96.510329] sync id 0 end
[ 96.513010] sync id 0 start flags=0x0
[ 96.516698] sync id 0 end
[ 96.519375] sync id 0 start flags=0x0
[ 96.523064] sync id 0 end
[ 96.525743] sync id 0 start flags=0x0
[ 96.529433] sync id 0 end
[ 96.532123] sync id 0 start flags=0x0
[ 96.535807] sync id 0 end
[ 96.538482] sync id 0 start flags=0x0
[ 96.542180] sync id 0 end
[ 120.153874] hrtimer: interrupt took 36705294 ns
[ 127.899082] sync id 0 start flags=0x0
[ 127.902990] sync id 0 end
[ 127.907714] sync id 0 start flags=0x0
[ 127.911843] sync id 0 end
[ 127.931209] sync id 0 start flags=0x0
[ 127.934946] sync id 0 end
[ 139.160032] queued id 14 cpu 0
[ 139.172883] queued id 15 cpu 18
[ 139.181679] queued id 16 cpu 2
[ 139.188863] queued id 17 cpu 32
[ 139.191187] queued id 18 cpu 20
[ 139.192250] queued id 19 cpu 26
[ 139.196460] queued id 20 cpu 28
[ 139.199247] queued id 21 cpu 5
[ 139.201744] queued id 22 cpu 3
[ 139.203930] queued id 23 cpu 7
[ 139.204327] queued id 24 cpu 1
[ 139.208413] queued id 25 cpu 22
[ 139.211233] queued id 26 cpu 11
[ 139.211703] queued id 27 cpu 16
[ 139.211779] queued id 28 cpu 30
[ 139.212355] queued id 30 cpu 6
[ 139.219907] queued id 29 cpu 34
[ 139.221969] queued id 31 cpu 12
[ 139.222921] queued id 32 cpu 21
[ 139.225538] queued id 33 cpu 17
[ 139.228792] queued id 34 cpu 23
[ 139.229079] queued id 35 cpu 38
[ 139.230891] queued id 36 cpu 9
[ 139.231900] queued id 37 cpu 14
[ 139.233258] queued id 38 cpu 8
[ 139.237469] queued id 39 cpu 10
[ 139.238076] queued id 40 cpu 37
[ 139.238383] queued id 41 cpu 36
[ 139.238940] queued id 42 cpu 15
[ 139.243346] queued id 43 cpu 19
[ 139.243551] queued id 44 cpu 33
[ 139.244297] queued id 45 cpu 24
[ 139.248611] queued id 46 cpu 4
[ 139.261041] queued id 47 cpu 29
[ 139.262479] queued id 48 cpu 13
[ 139.270502] queued id 49 cpu 31
[ 139.283937] queued id 50 cpu 25
[ 139.284946] queued id 51 cpu 48
[ 139.289477] queued id 52 cpu 27
[ 139.293538] queued id 53 cpu 35
[ 139.293222] queued id 54 cpu 52
[ 139.308200] queued id 54 cpu 39
[ 139.310656] queued id 55 cpu 50
[ 139.312429] queued id 56 cpu 51
[ 139.318852] queued id 57 cpu 41
[ 139.323168] queued id 58 cpu 54
[ 139.323640] queued id 59 cpu 53
[ 139.333683] queued id 60 cpu 42
[ 139.333761] queued id 61 cpu 44
[ 139.341994] queued id 62 cpu 43
[ 139.352690] queued id 63 cpu 47
[ 139.383769] queued id 64 cpu 40
[ 139.389155] queued id 65 cpu 45
[ 139.394256] queued id 66 cpu 49
[ 139.398477] queued id 67 cpu 46
[ 139.403971] queued id 68 cpu 58
[ 139.411613] queued id 69 cpu 60
[ 139.418758] queued id 70 cpu 59
[ 139.425070] queued id 71 cpu 55
[ 139.425238] queued id 72 cpu 61
[ 139.427661] queued id 73 cpu 62
[ 139.443378] queued id 74 cpu 56
[ 139.450999] queued id 75 cpu 57
[ 139.192788] queued id 76 cpu 63
[ 139.192788] queued id 77 cpu 32
[ 145.360434] queued id 78 cpu 32
[ 145.360511] run id 53 end flags=0x0
[ 147.239942] run id 44 end flags=0x0
[ 147.240966] run id 14 end flags=0x0
[ 147.324199] queued id 81 cpu 54
[ 147.543397] queued id 84 cpu 58
[ 148.639136] sync id 14 start flags=0x0
[ 148.642990] sync id 14 end
[ 148.645842] run id 24 end flags=0x0
[ 148.645860] sync id 24 start flags=0x0
[ 148.645872] sync id 24 end
[ 148.645969] run id 16 end flags=0x0
[ 148.645989] sync id 16 start flags=0x0
[ 148.645995] sync id 16 end
[ 148.646081] run id 22 end flags=0x0
[ 148.646102] sync id 22 start flags=0x0
[ 148.646110] sync id 22 end
[ 148.646194] run id 46 end flags=0x0
[ 148.646217] sync id 46 start flags=0x0
[ 148.646223] sync id 46 end
[ 148.646290] run id 21 end flags=0x0
[ 148.646306] sync id 21 start flags=0x0
[ 148.646311] sync id 21 end
[ 148.646394] run id 30 end flags=0x0
[ 148.646413] sync id 30 start flags=0x0
[ 148.646417] sync id 30 end
[ 148.646489] run id 23 end flags=0x0
[ 148.646508] sync id 23 start flags=0x0
[ 148.646519] sync id 23 end
[ 148.646614] run id 38 end flags=0x0
[ 148.646630] sync id 38 start flags=0x0
[ 148.646642] sync id 38 end
[ 148.646720] run id 36 end flags=0x0
[ 148.646734] sync id 36 start flags=0x0
[ 148.646740] sync id 36 end
[ 148.646821] run id 39 end flags=0x0
[ 148.646843] sync id 39 start flags=0x0
[ 148.646849] sync id 39 end
[ 148.646925] run id 26 end flags=0x0
[ 148.646943] sync id 26 start flags=0x0
[ 148.646950] sync id 26 end
[ 148.647048] run id 31 end flags=0x0
[ 148.647068] sync id 31 start flags=0x0
[ 148.647074] sync id 31 end
[ 148.647175] run id 48 end flags=0x0
[ 148.647193] sync id 48 start flags=0x0
[ 148.647204] sync id 48 end
[ 148.647269] run id 37 end flags=0x0
[ 148.647288] sync id 37 start flags=0x0
[ 148.647293] sync id 37 end
[ 148.647359] run id 42 end flags=0x0
[ 148.647380] sync id 42 start flags=0x0
[ 148.647384] sync id 42 end
[ 148.647477] run id 27 end flags=0x0
[ 148.647496] sync id 27 start flags=0x0
[ 148.647502] sync id 27 end
[ 148.647587] run id 33 end flags=0x0
[ 148.647609] sync id 33 start flags=0x0
[ 148.647617] sync id 33 end
[ 148.647722] run id 15 end flags=0x0
[ 148.647742] sync id 15 start flags=0x0
[ 148.647748] sync id 15 end
[ 148.647827] run id 43 end flags=0x0
[ 148.647843] sync id 43 start flags=0x0
[ 148.647849] sync id 43 end
[ 148.647933] run id 18 end flags=0x0
[ 148.647953] sync id 18 start flags=0x0
[ 148.647957] sync id 18 end
[ 148.648032] run id 32 end flags=0x0
[ 148.648053] sync id 32 start flags=0x0
[ 148.648060] sync id 32 end
[ 148.648162] run id 25 end flags=0x0
[ 148.648183] sync id 25 start flags=0x0
[ 148.648190] sync id 25 end
[ 148.648273] run id 34 end flags=0x0
[ 148.648288] sync id 34 start flags=0x0
[ 148.648296] sync id 34 end
[ 148.648382] run id 45 end flags=0x0
[ 148.648401] sync id 45 start flags=0x0
[ 148.648408] sync id 45 end
[ 148.648486] run id 50 end flags=0x0
[ 148.648509] sync id 50 start flags=0x0
[ 148.648517] sync id 50 end
[ 148.648644] run id 19 end flags=0x0
[ 148.648665] sync id 19 start flags=0x0
[ 148.648673] sync id 19 end
[ 148.648745] run id 52 end flags=0x0
[ 148.648765] sync id 52 start flags=0x0
[ 148.648768] sync id 52 end
[ 148.648852] run id 20 end flags=0x0
[ 148.648871] sync id 20 start flags=0x0
[ 148.648875] sync id 20 end
[ 148.648942] run id 47 end flags=0x0
[ 148.648960] sync id 47 start flags=0x0
[ 148.648968] sync id 47 end
[ 148.649055] run id 28 end flags=0x0
[ 148.649084] sync id 28 start flags=0x0
[ 148.649090] sync id 28 end
[ 148.649165] run id 49 end flags=0x0
[ 148.649188] sync id 49 start flags=0x0
[ 148.649195] sync id 49 end
[ 148.649388] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 294.941 msecs
[ 148.649397] run id 78 end flags=0x0
[ 148.649422] perf interrupt took too long (76443 > 1250000), lowering kernel.perf_event_max_sample_rate to 100
[ 148.649431] run id 80 end flags=0x0
[ 148.649436] run id 17 end flags=0x0
[ 148.649446] sync id 17 end
[ 148.649481] sync id 44 start flags=0x0
[ 148.649481] sync id 44 end
[ 268.622677] INFO: rcu_sched self-detected stall on CPU { 34} (t=12000 jiffies g=1200 c=1199 q=738)
[ 268.622677] Task dump for CPU 34:
[ 268.622677] perf R running task 0 3640 3400 0x00000008
[ 268.622677] ffffffff82239a00 ffff8813efc83da8 ffffffff810cb80e 0000000000000022
[ 268.622677] ffffffff82239a00 ffff8813efc83dc8 ffffffff810ce538 ffff8813efc83e08
[ 268.622677] 0000000000000023 ffff8813efc83df8 ffffffff810e9021 ffff8813efc93280
[ 268.622677] Call Trace:
[ 268.622677] <IRQ> [<ffffffff810cb80e>] sched_show_task+0xae/0x120
[ 268.622677] [<ffffffff810ce538>] dump_cpu_task+0x38/0x40
[ 268.622677] [<ffffffff810e9021>] rcu_dump_cpu_stacks+0x91/0xd0
[ 268.622677] [<ffffffff810ebc69>] rcu_check_callbacks+0x3f9/0x6d0
[ 268.622677] [<ffffffff810f488e>] ? update_wall_time+0x3fe/0x690
[ 268.622677] [<ffffffff810ee3d4>] update_process_times+0x34/0x60
[ 268.622677] [<ffffffff810fc8a1>] tick_sched_handle.isra.18+0x31/0x40
[ 268.622677] [<ffffffff810fc8ec>] tick_sched_timer+0x3c/0x80
[ 268.622677] [<ffffffff810eee32>] __run_hrtimer.isra.34+0x42/0x100
[ 268.622677] [<ffffffff810ef0fd>] hrtimer_interrupt+0xcd/0x1f0
[ 268.622677] [<ffffffff8105bd07>] local_apic_timer_interrupt+0x37/0x60
[ 268.622677] [<ffffffff8105c16c>] smp_apic_timer_interrupt+0x3c/0x50
[ 268.622677] [<ffffffff81b2d517>] apic_timer_interrupt+0x67/0x70
[ 268.622677] <EOI> [<ffffffff8111aaaa>] ? irq_work_sync+0x2a/0xb0
[ 268.622677] [<ffffffff81120256>] _free_event+0x16/0x150
[ 268.622677] [<ffffffff811204b1>] put_event+0xd1/0x120
[ 268.622677] [<ffffffff81120520>] perf_release+0x10/0x20
[ 268.622677] [<ffffffff81185962>] __fput+0xa2/0x200
[ 268.622677] [<ffffffff81185b09>] ____fput+0x9/0x10
[ 268.622677] [<ffffffff810c1ca7>] task_work_run+0xb7/0xf0
[ 268.622677] [<ffffffff8103bf28>] do_notify_resume+0x68/0x70
[ 268.622677] [<ffffffff81b2c731>] int_signal+0x12/0x17

-- [2]

--- a/kernel/irq_work.c
+++ b/kernel/irq_work.c
@@ -22,6 +22,7 @@

static DEFINE_PER_CPU(struct llist_head, raised_list);
static DEFINE_PER_CPU(struct llist_head, lazy_list);
+static unsigned long id;

/*
* Claim the entry so that no one else will poke at it.
@@ -40,8 +41,10 @@ static bool irq_work_claim(struct irq_work *work)
oflags = cmpxchg(&work->flags, flags, nflags);
if (oflags == flags)
break;
- if (oflags & IRQ_WORK_PENDING)
+ if (oflags & IRQ_WORK_PENDING) {
+ if (!(oflags & IRQ_WORK_BUSY)) pr_err("already queued id %lu\n", work->id);
return false;
+ }
flags = oflags;
cpu_relax();
}
@@ -71,10 +74,14 @@ bool irq_work_queue_on(struct irq_work *work, int cpu)
/* Arch remote IPI send/receive backend aren't NMI safe */
WARN_ON_ONCE(in_nmi());

+ if (!(work->flags & IRQ_WORK_LAZY)) work->id = id++;
+
/* Only queue if not already pending */
if (!irq_work_claim(work))
return false;

+ if (!(work->flags & IRQ_WORK_LAZY)) pr_err("queued id %lu, cpu %d->%d", work->id, smp_processor_id(), cpu);
+
if (llist_add(&work->llnode, &per_cpu(raised_list, cpu)))
arch_send_call_function_single_ipi(cpu);

@@ -86,6 +93,8 @@ EXPORT_SYMBOL_GPL(irq_work_queue_on);
/* Enqueue the irq work @work on the current CPU */
bool irq_work_queue(struct irq_work *work)
{
+ if (!(work->flags & IRQ_WORK_LAZY)) work->id = id++;
+
/* Only queue if not already pending */
if (!irq_work_claim(work))
return false;
@@ -93,6 +102,8 @@ bool irq_work_queue(struct irq_work *work)
/* Queue the entry and raise the IPI if needed. */
preempt_disable();

+ if (!(work->flags & IRQ_WORK_LAZY)) pr_err("queued id %lu cpu %d", work->id, smp_processor_id());
+
/* If the work is "lazy", handle it from next tick if any */
if (work->flags & IRQ_WORK_LAZY) {
if (llist_add(&work->llnode, this_cpu_ptr(&lazy_list)) &&
@@ -153,12 +164,16 @@ static void irq_work_run_list(struct llist_head *list)
flags = work->flags & ~IRQ_WORK_PENDING;
xchg(&work->flags, flags);

+ if (!(work->flags & ~IRQ_WORK_LAZY))
+ pr_err("run id %lu start flags=0x%lx\n", work->id, work->flags);
work->func(work);
/*
* Clear the BUSY bit and return to the free state if
* no-one else claimed it meanwhile.
*/
(void)cmpxchg(&work->flags, flags, flags & ~IRQ_WORK_BUSY);
+ if (!(work->flags & IRQ_WORK_LAZY))
+ pr_err("run id %lu end flags=0x%lx\n", work->id, work->flags);
}
}

@@ -190,7 +205,13 @@ void irq_work_sync(struct irq_work *work)
{
WARN_ON_ONCE(irqs_disabled());

+ if (!(work->flags & ~IRQ_WORK_LAZY))
+ pr_err("sync id %lu start flags=0x%lx\n", work->id, work->flags);
+
while (work->flags & IRQ_WORK_BUSY)
cpu_relax();
+
+ if (!(work->flags & ~IRQ_WORK_LAZY))
+ pr_err("sync id %lu end\n", work->id);
}
EXPORT_SYMBOL_GPL(irq_work_sync);
--
Daniel J Blueman
Principal Software Engineer, Numascale AS

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/