Re: RCU stall when using function_graph
From: Paul E. McKenney
Date: Tue Aug 01 2017 - 18:04:14 EST
> Hi Paul,
>
> I have been trying to set the function_graph tracer for ftrace and each time I
> get a CPU stall.
>
> How to reproduce:
> -----------------
>
> echo function_graph > /sys/kernel/debug/tracing/current_tracer
>
> This error appears with v4.13-rc3 and v4.12-rc6.
>
> Is it something already reported ?
I have seen this sort of thing, but only when actually dumping the trace
out, and I though those got fixed. You are seeing this just accumulating
the trace?
These RCU CPU stall warnings usually occur when something grabs hold of
a CPU for too long, as in 21 seconds or so. One way that they can happen
is excessive lock contention, another is having the kernel run through
too much data at one shot.
Adding Steven Rostedt on CC for his thoughts.
Thanx, Paul
> [ ... ]
>
> [ 472.934986] INFO: rcu_preempt detected stalls on CPUs/tasks:
> [ 472.941302] 0-...: (3 GPs behind) idle=d1a/2/0 softirq=2556/2557 fqs=3195
> [ 472.948982] (detected by 3, t=6502 jiffies, g=549, c=548, q=10)
> [ 472.955180] Task dump for CPU 0:
> [ 472.958513] swapper/0 R running task 0 0 0 0x00000002
> [ 550.954786] INFO: rcu_preempt detected stalls on CPUs/tasks:
> [ 550.960657] 0-...: (3 GPs behind) idle=d1a/2/0 softirq=2556/2557 fqs=12926
> [ 550.967897] (detected by 3, t=26007 jiffies, g=549, c=548, q=27)
> [ 550.974181] Task dump for CPU 0:
> [ 550.977513] swapper/0 R running task 0 0 0 0x00000002
> [ 628.974788] INFO: rcu_preempt detected stalls on CPUs/tasks:
> [ 628.980659] 0-...: (3 GPs behind) idle=d1a/2/0 softirq=2556/2557 fqs=22666
> [ 628.987900] (detected by 3, t=45512 jiffies, g=549, c=548, q=43)
> [ 628.994184] Task dump for CPU 0:
> [ 628.997518] swapper/0 R running task 0 0 0 0x00000002
> [ 706.994787] INFO: rcu_preempt detected stalls on CPUs/tasks:
> [ 707.000658] 0-...: (3 GPs behind) idle=d1a/2/0 softirq=2556/2557 fqs=31785
> [ 707.007899] (detected by 3, t=65017 jiffies, g=549, c=548, q=74)
> [ 707.014183] Task dump for CPU 0:
> [ 707.017516] swapper/0 R running task 0 0 0 0x00000002
> [ 707.024902] rcu_preempt kthread starved for 1261 jiffies! g549 c548 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> [ 707.034548] rcu_preempt S 0 8 2 0x00000000
> [ 726.049583] INFO: task jbd2/mmcblk1p2-:1120 blocked for more than 120 seconds.
> [ 726.061772] Not tainted 4.13.0-rc3-ktest-linaro-hikey #202
> [ 726.074294] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 726.089027] jbd2/mmcblk1p2- D 0 1120 2 0x00000000
> [ 726.126737] INFO: task rs:main Q:Reg:2272 blocked for more than 120 seconds.
> [ 726.140484] Not tainted 4.13.0-rc3-ktest-linaro-hikey #202
> [ 726.154515] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 726.170140] rs:main Q:Reg D 0 2272 1 0x00000000
> [ 726.205626]
> [ 726.205626] Showing all locks held in the system:
> [ 726.272952]
> [ 726.278194] =============================================
> [ 726.278194]
> [ 785.014779] INFO: rcu_preempt detected stalls on CPUs/tasks:
> [ 785.020658] 0-...: (3 GPs behind) idle=d1a/2/0 softirq=2556/2557 fqs=41387
> [ 785.027905] (detected by 3, t=84524 jiffies, g=549, c=548, q=94)
> [ 785.034194] Task dump for CPU 0:
> [ 785.037532] swapper/0 R running task 0 0 0 0x00000002
> [ 846.879753] INFO: task jbd2/mmcblk1p2-:1120 blocked for more than 120 seconds.
> [ 846.892966] Not tainted 4.13.0-rc3-ktest-linaro-hikey #202
> [ 846.909597] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 846.926726] jbd2/mmcblk1p2- D 0 1120 2 0x00000000
> [ 846.981356] INFO: task rs:main Q:Reg:2272 blocked for more than 120 seconds.
> [ 847.001593] Not tainted 4.13.0-rc3-ktest-linaro-hikey #202
> [ 847.017642] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 847.034135] rs:main Q:Reg D 0 2272 1 0x00000000
> [ 847.077209]
> [ 847.077209] Showing all locks held in the system:
> [ 847.180736]
> [ 847.186753] =============================================
> [ 847.186753]
> [ 863.034778] INFO: rcu_preempt detected stalls on CPUs/tasks:
> [ 863.040660] 0-...: (3 GPs behind) idle=d1a/2/0 softirq=2556/2557 fqs=50881
> [ 863.047907] (detected by 4, t=104030 jiffies, g=549, c=548, q=114)
> [ 863.054373] Task dump for CPU 0:
> [ 863.057711] swapper/0 R running task 0 0 0 0x00000002
> [ 941.054780] INFO: rcu_preempt detected stalls on CPUs/tasks:
> [ 941.060659] 0-...: (3 GPs behind) idle=d1a/2/0 softirq=2556/2557 fqs=59726
> [ 941.067903] (detected by 7, t=123535 jiffies, g=549, c=548, q=303)
> [ 941.074370] Task dump for CPU 0:
> [ 941.077708] swapper/0 R running task 0 0 0 0x00000002
> [ 967.711682] INFO: task jbd2/mmcblk1p2-:1120 blocked for more than 120 seconds.
> [ 967.725731] Not tainted 4.13.0-rc3-ktest-linaro-hikey #202
> [ 967.743347] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 967.765658] jbd2/mmcblk1p2- D 0 1120 2 0x00000000
> [ 967.810595] INFO: task rs:main Q:Reg:2272 blocked for more than 120 seconds.
> [ 967.825360] Not tainted 4.13.0-rc3-ktest-linaro-hikey #202
> [ 967.838595] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 967.854460] rs:main Q:Reg D 0 2272 1 0x00000000
> [ 967.897804]
> [ 967.897804] Showing all locks held in the system:
> [ 967.997067]
> [ 968.002570] =============================================
>
>
> [ ... ]
>
> etc ...