[mainline][ppc][bnx2x] watchdog: CPU 80 self-detected hard LOCKUP @ opal_interrupt+0x28/0x70 when module load/unload
From: Abdul Haleem
Date: Mon Sep 24 2018 - 04:57:05 EST
Greetings,
A bnx2x module load/unload test produces continuous hard LOCKUP traces on
my bare-metal powerpc machine running the mainline 4.19.0-rc4 kernel.

The instruction address (NIP) points to:
0xc00000000009d048 is in opal_interrupt
(arch/powerpc/platforms/powernv/opal-irqchip.c:133).
128
129 static irqreturn_t opal_interrupt(int irq, void *data)
130 {
131 __be64 events;
132
133 opal_handle_interrupt(virq_to_hw(irq), &events);
134 last_outstanding_events = be64_to_cpu(events);
135 if (opal_have_pending_events())
136 opal_wake_poller();
137
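
As a cross-check on the address above, the instruction dump in the trace marks the word at NIP as <60000000>, preceded by 4bff657d. The sketch below decodes those two words by hand. It is illustrative only and was not part of the test run: the helper name decode_branch is hypothetical, and the branch target it computes is presumably the opal_handle_interrupt() call wrapper, which the log itself does not confirm.

```python
# Minimal decoder for PowerPC I-form branches (primary opcode 18),
# used to inspect the two words around NIP in the instruction dump.

NIP = 0xC00000000009D048      # from the "NIP:" line of the trace
WORD_BEFORE_NIP = 0x4BFF657D  # word printed just before the <...> marker
WORD_AT_NIP = 0x60000000      # word inside the <...> marker
NOP = 0x60000000              # "ori 0,0,0", the canonical PowerPC nop


def decode_branch(word, addr):
    """Return mnemonic and target for an I-form branch, or None otherwise."""
    if (word >> 26) != 18:          # primary opcode lives in the top 6 bits
        return None
    li = word & 0x03FFFFFC          # 24-bit LI field, already shifted left by 2
    if li & 0x02000000:             # sign-extend the 26-bit displacement
        li -= 0x04000000
    absolute = bool(word & 0x2)     # AA bit
    link = bool(word & 0x1)         # LK bit
    target = li if absolute else (addr + li) & 0xFFFFFFFFFFFFFFFF
    mnemonic = "b" + ("l" if link else "") + ("a" if absolute else "")
    return {"mnemonic": mnemonic, "target": target}


call = decode_branch(WORD_BEFORE_NIP, NIP - 4)
print(call["mnemonic"], hex(call["target"]))  # bl 0xc0000000000935c0
print("nop at NIP:", WORD_AT_NIP == NOP)      # nop at NIP: True
```

The word before NIP is a bl and the word at NIP is the nop that follows it, so the reported NIP is the return address of a call made from opal_interrupt(), which agrees with line 133 of the listing.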
trace:
bnx2x 0008:01:00.3 enP8p1s0f3: renamed from eth0
bnx2x 0008:01:00.3 enP8p1s0f3: using MSI-X IRQs: sp 297 fp[0] 299 ... fp[7] 306
bnx2x 0008:01:00.2 enP8p1s0f2: NIC Link is Up, 1000 Mbps full duplex, Flow control: none
bnx2x 0008:01:00.3 enP8p1s0f3: NIC Link is Up, 1000 Mbps full duplex, Flow control: none
bnx2x: QLogic 5771x/578xx 10/20-Gigabit Ethernet Driver bnx2x 1.712.30-0 (2014/02/10)
bnx2x 0008:01:00.0: msix capability found
bnx2x 0008:01:00.0: Using 64-bit DMA iommu bypass
bnx2x 0008:01:00.0: part number 0-0-0-0
bnx2x 0008:01:00.0: 32.000 Gb/s available PCIe bandwidth (5 GT/s x8 link)
bnx2x 0008:01:00.0 enP8p1s0f0: renamed from eth0
bnx2x 0008:01:00.1: msix capability found
bnx2x 0008:01:00.1: Using 64-bit DMA iommu bypass
bnx2x 0008:01:00.1: part number 0-0-0-0
bnx2x 0008:01:00.0 enP8p1s0f0: using MSI-X IRQs: sp 267 fp[0] 269 ... fp[7] 276
bnx2x 0008:01:00.0 enP8p1s0f0: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - receive & transmit
bnx2x 0008:01:00.1: 32.000 Gb/s available PCIe bandwidth (5 GT/s x8 link)
bnx2x 0008:01:00.1 enP8p1s0f1: renamed from eth0
bnx2x 0008:01:00.2: msix capability found
bnx2x 0008:01:00.2: Using 64-bit DMA iommu bypass
bnx2x 0008:01:00.2: part number 0-0-0-0
bnx2x 0008:01:00.1 enP8p1s0f1: using MSI-X IRQs: sp 277 fp[0] 279 ... fp[7] 286
bnx2x 0008:01:00.1 enP8p1s0f1: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - receive & transmit
watchdog: CPU 80 self-detected hard LOCKUP @ opal_interrupt+0x28/0x70
watchdog: CPU 80 TB:980794111093, last heartbeat TB:973959617200 (13348ms ago)
Modules linked in: bnx2x(+) iptable_mangle ipt_MASQUERADE iptable_nat
nf_nat_ipv4 nf_nat xt_conntrack nf_conntrack nf_defrag_ipv4 ipt_REJECT
nf_reject_ipv4 xt_tcpudp tun bridge stp llc iptable_filter dm_mirror
dm_region_hash dm_log dm_service_time vmx_crypto powernv_rng rng_core
dm_multipath kvm_hv kvm binfmt_misc nfsd ip_tables x_tables autofs4 xfs
lpfc crc_t10dif crct10dif_generic nvme_fc nvme_fabrics mdio libcrc32c
nvme_core crct10dif_common [last unloaded: bnx2x]
CPU: 80 PID: 0 Comm: swapper/80 Not tainted 4.19.0-rc4-autotest-autotest #1
NIP: c00000000009d048 LR: c000000000092fd0 CTR: 0000000030032a00
REGS: c000003fff493d80 TRAP: 0900 Not tainted (4.19.0-rc4-autotest-autotest)
MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 48004042 XER: 00000000
CFAR: c000000000092fbc IRQMASK: 1
GPR00: 0000000030005128 c000003fff70f220 c0000000010ae500 0000000000000000
GPR04: 0000000048004042 c00000000009d048 9000000000009033 0000000000000090
GPR08: 0000000000000000 0000000000000000 c000000000092fe4 9000000000001003
GPR12: c000000000092fbc c000003fff7ff300 c000003c96c80c00 0000000000010000
GPR16: 0000000000000000 000000000000003c c000003c96c80800 c000003c96d00700
GPR20: 0000000000000001 0000000000000001 0000000000000002 0000000000000014
GPR24: c000001fe8741000 c000003fff70f330 0000000000000000 c000003ca947fb40
GPR28: 00000000092f47d0 0000000000000014 c000001fe8741000 c000001fe9860200
NIP [c00000000009d048] opal_interrupt+0x28/0x70
LR [c000000000092fd0] opal_return+0x14/0x48
Call Trace:
[c000003fff70f220] [c00000000009d048] opal_interrupt+0x28/0x70 (unreliable)
[c000003fff70f250] [c00000000016d890] __handle_irq_event_percpu+0x90/0x2d0
[c000003fff70f310] [c00000000016db00] handle_irq_event_percpu+0x30/0x90
[c000003fff70f350] [c00000000016dbc0] handle_irq_event+0x60/0xc0
[c000003fff70f380] [c000000000172d2c] handle_fasteoi_irq+0xbc/0x1f0
[c000003fff70f3b0] [c00000000016c084] generic_handle_irq+0x44/0x70
[c000003fff70f3d0] [c0000000000193cc] __do_irq+0x8c/0x200
[c000003fff70f440] [c000000000019640] do_IRQ+0x100/0x110
[c000003fff70f490] [c000000000008db8] hardware_interrupt_common+0x158/0x160
--- interrupt: 501 at fib_table_lookup+0xfc/0x600
LR = fib_validate_source+0x148/0x370
[c000003fff70f780] [0000000000000000] (null) (unreliable)
[c000003fff70f7e0] [c000000000959af8] fib_validate_source+0x148/0x370
[c000003fff70f8a0] [c0000000008fd664] ip_route_input_rcu+0x214/0x970
[c000003fff70f990] [c0000000008fdde0] ip_route_input_noref+0x20/0x30
[c000003fff70f9e0] [c000000000945e28] arp_process.constprop.14+0x3d8/0x8a0
[c000003fff70faf0] [c00000000089eb20] __netif_receive_skb_one_core+0x60/0x80
[c000003fff70fb30] [c0000000008a7d00] netif_receive_skb_internal+0x30/0x110
[c000003fff70fb70] [c0000000008a888c] napi_gro_receive+0x11c/0x1c0
[c000003fff70fbb0] [c000000000702afc] tg3_poll_work+0x5fc/0x1060
[c000003fff70fcb0] [c0000000007035b4] tg3_poll_msix+0x54/0x210
[c000003fff70fd00] [c0000000008a922c] net_rx_action+0x31c/0x470
[c000003fff70fe10] [c0000000009f5afc] __do_softirq+0x15c/0x3b4
[c000003fff70ff00] [c0000000000fddf0] irq_exit+0x100/0x120
[c000003fff70ff20] [c0000000000193d8] __do_irq+0x98/0x200
[c000003fff70ff90] [c00000000002af24] call_do_irq+0x14/0x24
[c000003ca947fa80] [c0000000000195d4] do_IRQ+0x94/0x110
[c000003ca947fad0] [c000000000008db8] hardware_interrupt_common+0x158/0x160
--- interrupt: 501 at replay_interrupt_return+0x0/0x4
LR = arch_local_irq_restore+0x84/0x90
[c000003ca947fdc0] [0000000000080000] 0x80000 (unreliable)
[c000003ca947fde0] [c000000000181f60] rcu_idle_exit+0xa0/0xd0
[c000003ca947fe30] [c000000000136d08] do_idle+0x1c8/0x3a0
[c000003ca947fec0] [c0000000001370b4] cpu_startup_entry+0x34/0x40
[c000003ca947fef0] [c0000000000467f4] start_secondary+0x4d4/0x520
[c000003ca947ff90] [c00000000000b270] start_secondary_prolog+0x10/0x14
Instruction dump:
60000000 60420000 3c4c0101 384214e0 7c0802a6 78630020 f8010010 f821ffd1
4bf7b901 60000000 38810020 4bff657d <60000000> 39010020 3d42ffed e94a5d28
watchdog: CPU 80 became unstuck TB:980802789270
CPU: 80 PID: 412 Comm: ksoftirqd/80 Not tainted 4.19.0-rc4-autotest-autotest #1
Call Trace:
[c000003ca96f7910] [c0000000009d4cec] dump_stack+0xb0/0xf4 (unreliable)
[c000003ca96f7950] [c00000000002f278] wd_smp_clear_cpu_pending+0x368/0x3f0
[c000003ca96f7a10] [c00000000002fa48] wd_timer_fn+0x78/0x3a0
[c000003ca96f7ad0] [c00000000018a3c0] call_timer_fn+0x50/0x1b0
[c000003ca96f7b50] [c00000000018a658] expire_timers+0x138/0x1e0
[c000003ca96f7bc0] [c00000000018a7c8] run_timer_softirq+0xc8/0x220
[c000003ca96f7c50] [c0000000009f5afc] __do_softirq+0x15c/0x3b4
[c000003ca96f7d40] [c0000000000fdab4] run_ksoftirqd+0x54/0x80
[c000003ca96f7d60] [c000000000126f10] smpboot_thread_fn+0x290/0x2a0
[c000003ca96f7dc0] [c0000000001215ac] kthread+0x15c/0x1a0
[c000003ca96f7e30] [c00000000000bdd4] ret_from_kernel_thread+0x5c/0x68
bnx2x 0008:01:00.2: 32.000 Gb/s available PCIe bandwidth (5 GT/s x8 link)
bnx2x 0008:01:00.2 enP8p1s0f2: renamed from eth0
bnx2x 0008:01:00.3: msix capability found
bnx2x 0008:01:00.3: Using 64-bit DMA iommu bypass
bnx2x 0008:01:00.3: part number 0-0-0-0
bnx2x 0008:01:00.3: 32.000 Gb/s available PCIe bandwidth (5 GT/s x8 link)
bnx2x 0008:01:00.3 enP8p1s0f3: renamed from eth0
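
The stall duration can be re-derived from the timebase (TB) values the watchdog prints. The sketch below assumes a 512 MHz timebase, which is the usual value on POWER8/POWER9 but is not stated anywhere in the log; the helper name tb_delta_ms is hypothetical.

```python
# Convert differences between timebase (TB) readings into milliseconds.

TB_HZ = 512_000_000  # assumption: 512 MHz timebase, not taken from the log

# Values copied from the watchdog lines in the trace above.
TB_LAST_HEARTBEAT = 973959617200
TB_LOCKUP_DETECTED = 980794111093
TB_UNSTUCK = 980802789270


def tb_delta_ms(now_tb, then_tb, tb_hz=TB_HZ):
    """Whole milliseconds elapsed between two timebase readings."""
    return (now_tb - then_tb) * 1000 // tb_hz


stalled_ms = tb_delta_ms(TB_LOCKUP_DETECTED, TB_LAST_HEARTBEAT)
recovered_ms = tb_delta_ms(TB_UNSTUCK, TB_LOCKUP_DETECTED)

print(stalled_ms)    # 13348, matching "(13348ms ago)" in the watchdog line
print(recovered_ms)  # 16
```

Under that assumption, CPU 80 went about 13.3 seconds without a heartbeat and became unstuck roughly 16 ms after the watchdog fired.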
--
Regards,
Abdul Haleem
IBM Linux Technology Centre