Re: Need your help on rcu_bh_state stall on CPU 0 (t=0 jiffies)
From: Paul E. McKenney
Date: Tue Feb 21 2017 - 03:06:48 EST
On Tue, Feb 21, 2017 at 11:36:42AM +0530, Sumit Gemini wrote:
> Thank Paul, you really save the day by giving your valuable input but I
> still confused in 2 things :-
>
>
> 1. If i suppress the rcu_bh_state() warning, will NMI not trigger?
> Because the problem is i don't know how to reproduce this issue. I observed
> this issue first time ever in my career and this issue occured at customer
> end so i have to give him a workaround.
Yes, suppressing the RCU CPU stall warning will also suppress the NMIs.
> 2. and one important thing, i have kernel stack traces which was
> occurred when stall issue observed. Many developer got the almost same
> stack-starce. similar this link "
> http://www.serverphorums.com/read.php?12,468996" in which Ralf and you
> were discussing. So can we not get any suspicious function or statement
> from this stack trace? or we need ftrace to debug this?
As described in stallwarn.txt (please do read this if you have not already
done so, and from your questions it does -not- seem like you have read
it), one cause of RCU CPU stall warnings is long loops in the kernel.
Adding cond_resched() in older kernels or (better) cond_resched_rcu_qs()
in newer kernels can prevent the stall warnings and other problems.
But calls to these functions must be carefully placed, because otherwise
you will get error messages such as "scheduling while atomic".
Thanx, Paul
> Thanks
> ~Sumit Gemini
>
> On Tue, Feb 21, 2017 at 8:31 AM, Paul E. McKenney <
> paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>
> > On Tue, Feb 21, 2017 at 06:36:34AM +0530, Sumit Gemini wrote:
> > > Hi Pual,
> > >
> > > Thanks for your time.
> > >
> > > we are using HA pair with TIPC and opensf functionality. Now problem is
> > on
> > > active machine TIPC link lost message was observed on 06:45:00 but
> > standby
> > > machine detected this link lost at 06:45:06 because at 06:45:00 standby
> > > machine having rcu_bh_state (CPU stall). Once standby machine detected
> > link
> > > lost, it fired reboot command for active machine with the help of opensf
> > at
> > > 06:45:07 to make itself active and standby to previously active machine.
> > >
> > > I raised this matter before TIPC open source community, and they told
> > > rcu_bh_state was the reason for delaying to get link down event at
> > standby
> > > end.
> > >
> > > In normal case TIPC link broken detects in < 2 second. But in our case
> > took
> > > 7 second because of that stall. It is first time, we observed such
> > > situation.
> > >
> > > Now if we suppres this warning, will we observe this delay again?
> >
> > That depends. If the delay was due to the NMIs and stack-trace printing,
> > suppressing the warning will make the delay go away. On the other hand,
> > if there really is a stall independent of the warning, then the delay
> > will still be there. Why not try it and find out?
> >
> > > Can you enlighten on this stall. Why does this stall happen at ideal
> > > machine? And if possible to debug it, what we need to do?
> > >
> > > Any pointers will be highly appreciated.
> >
> > I suggest reading Documentation/stallwarn.txt, especially the list
> > of possible causes near the end:
> >
> > https://www.kernel.org/doc/Documentation/RCU/stallwarn.txt
> >
> > If the delay persist when you suppress the warnings, then I suggest that
> > you use something like ftrace to gather information on the delay.
> >
> > Thanx, Paul
> >
> > > Thanks & Regards
> > > ~Sumit Gemini
> > >
> > >
> > > On Feb 21, 2017 2:28 AM, "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>
> > > wrote:
> > >
> > > > On Mon, Feb 20, 2017 at 03:26:32PM +0530, Sumit Gemini wrote:
> > > > > Hi Paul,
> > > > >
> > > > >
> > > > > I got you while reading 'http://www.serverphorums.com/
> > read.php?12,468996".
> > > > I'm
> > > > > facing same problem but not able to detect what exectly the cause of
> > > > > it.
> > > > >
> > > > >
> > > > > I'm observed weird problem, I got rcu_bh_state on cpu0 but my system
> > > > > was in ideal state when i got rcu_bh_state (Cpu stall) problem. After
> > > > > 7 second my system recovered itself from stall to active. But i did
> > > > > not get what happen at that time.
> > > >
> > > > One thing that you might look into below...
> > > >
> > > > > Jan 6 16:45:06 ffm-sbc-2a kernel: [3167216.520060] INFO: rcu_bh_state
> > > > detected stall on CPU 0 (t=0 jiffies)
> > > > > Jan 6 16:45:06 ffm-sbc-2a kernel: [3167216.524042] sending NMI to all
> > > > CPUs:
> > > > > Jan 6 16:45:06 ffm-sbc-2a kernel: [3167216.524042] NMI backtrace for
> > cpu
> > > > 0
> > > > > Jan 6 16:45:06 ffm-sbc-2a kernel: [3167216.524042] CPU 0
> > > > > Jan 6 16:45:06 ffm-sbc-2a kernel: [3167216.524042] Modules linked in:
> > > > nf_conntrack_netlink af_packet xt_sharedlimit xt_hashlimit
> > > > ip_set_hash_ipport ip_set_hash_ipportip xt_NOTRACK ip_set_bitmap_port
> > > > xt_sctp nf_conntrack_ipv6 nf_defrag_ipv6 xt_CT arpt_mangle
> > > > ip_set_hash_ipnet xt_NFLOG nfnetlink_log ipt_ULOG xt_limit
> > xt_hashcounter
> > > > ip_set_hash_ipip xt_set ip_set_hash_ip deflate zlib_deflate ctr
> > > > twofish_x86_64 twofish_common camellia serpent blowfish cast5
> > des_generic
> > > > cbc xcbc rmd160 sha512_generic sha256_generic sha1_generic md5
> > crypto_null
> > > > af_key iptable_mangle ip_set nfnetlink arptable_filter arp_tables
> > > > iptable_raw iptable_nat tipc xt_tcpudp xt_state xt_pkttype bonding
> > > > binfmt_misc iptable_filter ip6table_filter ip6_tables nf_nat_ftp nf_nat
> > > > nf_conntrack_ftp nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables x_tables
> > mperf
> > > > edd ipmi_devintf ipmi_si ipmi_msghandler nf_conntrack_proto_sctp
> > > > nf_conntrack sctp 8021q garp stp llc gb_sys usb_storage ioatdma ixgbe
> > uas
> > > > sg igb iTCO_wdt wmi i2c_i801 pcspkr mdio iTCO_vendor_support button
> > > > container dca ipv6 autofs4 usbhid megasr(P) ehci_hcd usbcore processor
> > > > thermal_sys [last unloaded: ipt_PORTMAP]
> > > > > Jan 6 16:45:06 ffm-sbc-2a kernel: [3167216.524042]
> > > > > Jan 6 16:45:06 ffm-sbc-2a kernel: [3167216.524042] Pid: 0, comm:
> > swapper
> > > > Tainted: P 3.1.10-gb17-default #1 Intel Corporation S2600CO/S2600CO
> > > > > Jan 6 16:45:06 ffm-sbc-2a kernel: [3167216.524042] RIP:
> > > > 0010:[<ffffffff81007f51>] [<ffffffff81007f51>] native_read_tsc+0x2/0xf
> > > > > Jan 6 16:45:06 ffm-sbc-2a kernel: [3167216.524042] RSP:
> > > > 0018:ffff88043ee03db0 EFLAGS: 00000803
> > > > > Jan 6 16:45:06 ffm-sbc-2a kernel: [3167216.524042] RAX:
> > 0000000037185395
> > > > RBX: 00000000000003e9 RCX: 0000000000000001
> > > > > Jan 6 16:45:07 ffm-sbc-2a osafimmd[5035]: WA IMMND DOWN on active
> > > > controller f2 detected at standby immd!! f1. Possible failover
> > > > > Jan 6 16:45:07 ffm-sbc-2a osaffmd[5023]: NO Done Locking applications
> > > > node id:2020f ret val:0
> > > > > Jan 6 16:45:07 ffm-sbc-2a opensaf_recovery: Control interface
> > status:0
> > > > Role:STANDBY
> > > > > Jan 6 16:45:07 ffm-sbc-2a osaffmd[5023]: NO Current role: STANDBY
> > > > > Jan 6 16:45:07 ffm-sbc-2a osaffmd[5023]: Rebooting OpenSAF NodeId =
> > > > 131599 EE Name = , Reason: Received Node Down for peer controller, >
> > > > OwnNodeId = 131343, SupervisionTime = 60
> > > >
> > > > Is this "WA IMMND DOWN" a normal occurrence? Perhaps it is pointing
> > > > at a root cause.
> > > >
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] RDX:
> > 0000000000bf0977
> > > > RSI: 0000000000000002 RDI: 0000000000032bd4
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] RBP:
> > 0000000000032bd4
> > > > R08: 0000000000000000 R09: ffffffff819232b0
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] R10:
> > 7fffffffffffffff
> > > > R11: 7fffffffffffffff R12: 0000000000000000
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] R13:
> > ffffffff819232b0
> > > > R14: 0000000000000001 R15: ffffffff81065c28
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] FS:
> > > > 0000000000000000(0000) GS:ffff88043ee00000(0000) knlGS:0000000000000000
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] CS: 0010 DS: 0000
> > ES:
> > > > 0000 CR0: 000000008005003b
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] CR2:
> > 000000000069e034
> > > > CR3: 0000000001805000 CR4: 00000000000406f0
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] DR0:
> > 0000000000000000
> > > > DR1: 0000000000000000 DR2: 0000000000000000
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] DR3:
> > 0000000000000000
> > > > DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] Process swapper
> > (pid:
> > > > 0, threadinfo ffffffff81800000, task ffffffff8180d020)
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] Stack:
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] ffffffff81200eb5
> > > > ffffffff81200f44 00000000000003e9 0000000000001000
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] 0000000000000002
> > > > ffffffff819232b0 ffffffff81017698 7fffffffffffffff
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] 0000000000000002
> > > > 0000000000000002 ffffffff81017fdf 0000000000000001
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] Call Trace:
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042]
> > [<ffffffff81200eb5>]
> > > > paravirt_read_tsc+0x5/0x8
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042]
> > [<ffffffff81200f44>]
> > > > delay_tsc+0x1d/0x68
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042]
> > [<ffffffff81017698>]
> > > > native_safe_apic_wait_icr_idle+0x27/0x32
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042]
> > [<ffffffff81017fdf>]
> > > > default_send_IPI_dest_field.constprop.0+0x19/0x4d
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042]
> > [<ffffffff8101804b>]
> > > > default_send_IPI_mask_sequence_phys+0x38/0x6a
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042]
> > [<ffffffff8101815e>]
> > > > arch_trigger_all_cpu_backtrace+0x4d/0x7b
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042]
> > [<ffffffff8109801b>]
> > > > check_cpu_stall+0x66/0xdb
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042]
> > [<ffffffff810980aa>]
> > > > rcu_pending+0x1a/0x10a
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042]
> > [<ffffffff8109852c>]
> > > > rcu_check_callbacks+0x9d/0xae
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042]
> > [<ffffffff8104c56c>]
> > > > update_process_times+0x31/0x63
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042]
> > [<ffffffff81065c92>]
> > > > tick_sched_timer+0x6a/0x90
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042]
> > [<ffffffff8105b872>]
> > > > __run_hrtimer+0xa4/0x148
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042]
> > [<ffffffff8105c08e>]
> > > > hrtimer_interrupt+0xdb/0x19a
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042]
> > [<ffffffff81017768>]
> > > > smp_apic_timer_interrupt+0x6e/0x80
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042]
> > [<ffffffff813efcde>]
> > > > apic_timer_interrupt+0x6e/0x80
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042]
> > [<ffffffff81239655>]
> > > > intel_idle+0xdd/0x104
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042]
> > [<ffffffff81304773>]
> > > > cpuidle_idle_call+0xda/0x169
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042]
> > [<ffffffff81001200>]
> > > > cpu_idle+0x51/0x95
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042]
> > [<ffffffff8193db0f>]
> > > > start_kernel+0x388/0x393
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042]
> > [<ffffffff8193d3af>]
> > > > x86_64_start_kernel+0xcf/0xdc
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] Code: 74 03 e6 80
> > c3
> > > > e6 ed c3 bf 8e 21 00 00 e9 ba 8f 1f 00 c3 90 90 90 40 88 f8 e6 70 e4
> > 71 c3
> > > > 40 88 f0 e6 70 40 88 f8 e6 71 c3 0f 31
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel[3167216.524042]: c1 48 89 d0 48 c1
> > e0
> > > > 20 48 09 c8 c3 41 56 41 55 41 54 55 53
> > > > >
> > > > > .
> > > > > .
> > > > > .
> > > > > .
> > > > > .
> > > > > .
> > > > >
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] NMI backtrace for
> > cpu
> > > > 31
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] CPU 31
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] Modules linked in:
> > > > nf_conntrack_netlink af_packet xt_sharedlimit xt_hashlimit
> > > > ip_set_hash_ipport ip_set_hash_ipportip xt_NOTRACK ip_set_bitmap_port
> > > > xt_sctp nf_conntrack_ipv6 nf_defrag_ipv6 xt_CT arpt_mangle
> > > > ip_set_hash_ipnet xt_NFLOG nfnetlink_log ipt_ULOG xt_limit
> > xt_hashcounter
> > > > ip_set_hash_ipip xt_set ip_set_hash_ip deflate ctr twofish_x86_64
> > > > twofish_common camellia serpent > blowfish cast5 des_generic cbc xcbc
> > > > rmd160 sha512_generic sha256_generic sha1_generic md5 crypto_null
> > af_key
> > > > iptable_mangle ip_set nfnetlink arptable_filter arp_tables iptable_raw
> > > > iptable_nat tipc xt_tcpudp xt_state xt_pkttype bonding binfmt_misc
> > > > iptable_filter ip6table_filter ip6_tables nf_nat_ftp nf_nat
> > > > nf_conntrack_ftp nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables x_tables
> > mperf
> > > > edd ipmi_devintf ipmi_si ipmi_msghandler nf_conntrack_proto_sctp
> > > > nf_conntrack sctp 8021q garp stp llc gb_sys usb_storage ioatdma ixgbe
> > uas
> > > > sg igb iTCO_wdt wmi i2c_i801 pcspkr mdio iTCO_vendor_support button
> > > > container dca ipv6 autofs4 usbhid megasr(P) ehci_hcd usbcore processor
> > > > thermal_sys [last unloaded: ipt_PORTMAP]
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734]
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] Pid: 0, comm:
> > > > kworker/0:1 Tainted: P 3.1.10-gb17-default #1 Intel Corporation
> > > > S2600CO/S2600CO
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] RIP:
> > > > 0010:[<ffffffff81239624>] [<ffffffff81239624>] intel_idle+0xac/0x104
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] RSP:
> > > > 0018:ffff880425e33eb8 EFLAGS: 00000046
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] RAX:
> > 0000000000000030
> > > > RBX: 0000000000000010 RCX: 0000000000000001
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] RDX:
> > 0000000000000000
> > > > RSI: ffff880425e33fd8 RDI: ffffffff81810500
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] RBP:
> > 0000000000000030
> > > > R08: 000000000000006d R09: 0000000000034b10
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] R10:
> > ffff88083eded830
> > > > R11: ffff88083eded830 R12: 149739342cb2ca49
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] R13:
> > 0000000000000004
> > > > R14: 000000000000001f R15: 0000000000000000
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] FS:
> > > > 0000000000000000(0000) GS:ffff88083ede0000(0000) knlGS:0000000000000000
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] CS: 0010 DS: 0000
> > ES:
> > > > 0000 CR0: 000000008005003b
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] CR2:
> > 00007fe89df67120
> > > > CR3: 0000000001805000 CR4: 00000000000406e0
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] DR0:
> > 0000000000000000
> > > > DR1: 0000000000000000 DR2: 0000000000000000
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] DR3:
> > 0000000000000000
> > > > DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] Process
> > kworker/0:1
> > > > (pid: 0, threadinfo ffff880425e32000, task ffff880425e30580)
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] Stack:
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167225.685484] TIPC: Resetting
> > link
> > > > <1.1.1:bond0-1.1.2:bond0>, requested by peer
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167225.685487] TIPC: Lost link
> > > > <1.1.1:bond0-1.1.2:bond0> on network plane A
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167225.685491] TIPC: Lost contact
> > > > with <1.1.2>
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167225.687214] 0000000000000000
> > > > 000000000cdd3a47 0000000000000000 000000000cdd3a47
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167225.687214] ffff880425e33fd8
> > > > 0000001f3edf8970 ffff88083edf8970 ffff88083edf8b00
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167225.687214] 0000000000000000
> > > > ffffffff81304773 ffffffff819232b0 ffff880425e33fd8
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167225.687214] Call Trace:
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167225.687214]
> > [<ffffffff81304773>]
> > > > cpuidle_idle_call+0xda/0x169
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167225.687214]
> > [<ffffffff81001200>]
> > > > cpu_idle+0x51/0x95
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167225.687214] Code: 28 e0 ff ff
> > 80
> > > > e2 08 75 22 31 d2 48 83 c0 10 48 89 d1 0f 01 c8 0f ae f0 48 8b 86 38
> > e0 ff
> > > > ff a8 08 75 08 b1 01 48 89 e8 0f 01 c9 <e8> 3f 6e e2 ff 4c 29 e0 48 89
> > c7
> > > > e8 10 ae e0 ff 48 69 e8 40 42
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167225.687214] Call Trace:
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167225.687214]
> > [<ffffffff81304773>]
> > > > cpuidle_idle_call+0xda/0x169
> > > > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167225.687214]
> > [<ffffffff81001200>]
> > > > cpu_idle+0x51/0x95
> > > >
> > > > Since NMIs were sent to all CPUs, it is not surprising that one was
> > found
> > > > to be idle. Or is there something that I am missing in CPU 31's
> > > > backtrace?
> > > >
> > > > > I'm using suse kernel 3.1.10
> > > >
> > > > That is going back a ways!
> > > >
> > > > There are quite a few patches to RCU CPU stall warnings that could
> > > > be applied. It might be easier to move to a newer kernel.
> > > >
> > > > Alternatively, if you are convinced that these stall warnings are
> > > > false positives, you could boot with the rcu_cpu_stall_suppress
> > > > kernel boot parameter set, or rebuild the kernel with a much larger
> > > > value of the RCU_CPU_STALL_TIMEOUT Kconfig option.
> > > >
> > > > Thanx, Paul
> > > >
> > > >
> >
> >