Re: RCU stall when using function_graph
From: Daniel Lezcano
Date: Tue Aug 01 2017 - 18:15:57 EST
On 02/08/2017 00:04, Paul E. McKenney wrote:
>> Hi Paul,
>>
>> I have been trying to set the function_graph tracer for ftrace and each time I
>> get a CPU stall.
>>
>> How to reproduce:
>> -----------------
>>
>> echo function_graph > /sys/kernel/debug/tracing/current_tracer
>>
>> This error appears with v4.13-rc3 and v4.12-rc6.
>>
>> Is it something already reported ?
>
> I have seen this sort of thing, but only when actually dumping the trace
> out, and I though those got fixed. You are seeing this just accumulating
> the trace?
No, just by changing the tracer. It is the first operation I do after
rebooting and it is reproducible each time. That happens on an ARM64
platform.
> These RCU CPU stall warnings usually occur when something grabs hold of
> a CPU for too long, as in 21 seconds or so. One way that they can happen
> is excessive lock contention, another is having the kernel run through
> too much data at one shot.
>
> Adding Steven Rostedt on CC for his thoughts.
>
> Thanx, Paul
>
>> [ ... ]
>>
>> [ 472.934986] INFO: rcu_preempt detected stalls on CPUs/tasks:
>> [ 472.941302] 0-...: (3 GPs behind) idle=d1a/2/0 softirq=2556/2557 fqs=3195
>> [ 472.948982] (detected by 3, t=6502 jiffies, g=549, c=548, q=10)
>> [ 472.955180] Task dump for CPU 0:
>> [ 472.958513] swapper/0 R running task 0 0 0 0x00000002
>> [ 550.954786] INFO: rcu_preempt detected stalls on CPUs/tasks:
>> [ 550.960657] 0-...: (3 GPs behind) idle=d1a/2/0 softirq=2556/2557 fqs=12926
>> [ 550.967897] (detected by 3, t=26007 jiffies, g=549, c=548, q=27)
>> [ 550.974181] Task dump for CPU 0:
>> [ 550.977513] swapper/0 R running task 0 0 0 0x00000002
>> [ 628.974788] INFO: rcu_preempt detected stalls on CPUs/tasks:
>> [ 628.980659] 0-...: (3 GPs behind) idle=d1a/2/0 softirq=2556/2557 fqs=22666
>> [ 628.987900] (detected by 3, t=45512 jiffies, g=549, c=548, q=43)
>> [ 628.994184] Task dump for CPU 0:
>> [ 628.997518] swapper/0 R running task 0 0 0 0x00000002
>> [ 706.994787] INFO: rcu_preempt detected stalls on CPUs/tasks:
>> [ 707.000658] 0-...: (3 GPs behind) idle=d1a/2/0 softirq=2556/2557 fqs=31785
>> [ 707.007899] (detected by 3, t=65017 jiffies, g=549, c=548, q=74)
>> [ 707.014183] Task dump for CPU 0:
>> [ 707.017516] swapper/0 R running task 0 0 0 0x00000002
>> [ 707.024902] rcu_preempt kthread starved for 1261 jiffies! g549 c548 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
>> [ 707.034548] rcu_preempt S 0 8 2 0x00000000
>> [ 726.049583] INFO: task jbd2/mmcblk1p2-:1120 blocked for more than 120 seconds.
>> [ 726.061772] Not tainted 4.13.0-rc3-ktest-linaro-hikey #202
>> [ 726.074294] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> [ 726.089027] jbd2/mmcblk1p2- D 0 1120 2 0x00000000
>> [ 726.126737] INFO: task rs:main Q:Reg:2272 blocked for more than 120 seconds.
>> [ 726.140484] Not tainted 4.13.0-rc3-ktest-linaro-hikey #202
>> [ 726.154515] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> [ 726.170140] rs:main Q:Reg D 0 2272 1 0x00000000
>> [ 726.205626]
>> [ 726.205626] Showing all locks held in the system:
>> [ 726.272952]
>> [ 726.278194] =============================================
>> [ 726.278194]
>> [ 785.014779] INFO: rcu_preempt detected stalls on CPUs/tasks:
>> [ 785.020658] 0-...: (3 GPs behind) idle=d1a/2/0 softirq=2556/2557 fqs=41387
>> [ 785.027905] (detected by 3, t=84524 jiffies, g=549, c=548, q=94)
>> [ 785.034194] Task dump for CPU 0:
>> [ 785.037532] swapper/0 R running task 0 0 0 0x00000002
>> [ 846.879753] INFO: task jbd2/mmcblk1p2-:1120 blocked for more than 120 seconds.
>> [ 846.892966] Not tainted 4.13.0-rc3-ktest-linaro-hikey #202
>> [ 846.909597] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> [ 846.926726] jbd2/mmcblk1p2- D 0 1120 2 0x00000000
>> [ 846.981356] INFO: task rs:main Q:Reg:2272 blocked for more than 120 seconds.
>> [ 847.001593] Not tainted 4.13.0-rc3-ktest-linaro-hikey #202
>> [ 847.017642] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> [ 847.034135] rs:main Q:Reg D 0 2272 1 0x00000000
>> [ 847.077209]
>> [ 847.077209] Showing all locks held in the system:
>> [ 847.180736]
>> [ 847.186753] =============================================
>> [ 847.186753]
>> [ 863.034778] INFO: rcu_preempt detected stalls on CPUs/tasks:
>> [ 863.040660] 0-...: (3 GPs behind) idle=d1a/2/0 softirq=2556/2557 fqs=50881
>> [ 863.047907] (detected by 4, t=104030 jiffies, g=549, c=548, q=114)
>> [ 863.054373] Task dump for CPU 0:
>> [ 863.057711] swapper/0 R running task 0 0 0 0x00000002
>> [ 941.054780] INFO: rcu_preempt detected stalls on CPUs/tasks:
>> [ 941.060659] 0-...: (3 GPs behind) idle=d1a/2/0 softirq=2556/2557 fqs=59726
>> [ 941.067903] (detected by 7, t=123535 jiffies, g=549, c=548, q=303)
>> [ 941.074370] Task dump for CPU 0:
>> [ 941.077708] swapper/0 R running task 0 0 0 0x00000002
>> [ 967.711682] INFO: task jbd2/mmcblk1p2-:1120 blocked for more than 120 seconds.
>> [ 967.725731] Not tainted 4.13.0-rc3-ktest-linaro-hikey #202
>> [ 967.743347] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> [ 967.765658] jbd2/mmcblk1p2- D 0 1120 2 0x00000000
>> [ 967.810595] INFO: task rs:main Q:Reg:2272 blocked for more than 120 seconds.
>> [ 967.825360] Not tainted 4.13.0-rc3-ktest-linaro-hikey #202
>> [ 967.838595] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> [ 967.854460] rs:main Q:Reg D 0 2272 1 0x00000000
>> [ 967.897804]
>> [ 967.897804] Showing all locks held in the system:
>> [ 967.997067]
>> [ 968.002570] =============================================
>>
>>
>> [ ... ]
>>
>> etc ...
>
--
<http://www.linaro.org/> Linaro.org â Open source software for ARM SoCs
Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog