WARNING: possible irq lock inversion dependency detected, at: close_transaction+0x48/0x230 [firewire_core] but this lock took another, SOFTIRQ-unsafe lock in the past: (fs_reclaim){+.+.}-{0:0} (kernel 6.5-rc6)

From: Erhard Furtner
Date: Sun Aug 20 2023 - 07:13:52 EST


Getting this at every boot on my PowerMac G5 11,2 (kernel 6.5-rc6) when CONFIG_PROVE_LOCKING=y is set:

[...]
========================================================
WARNING: possible irq lock inversion dependency detected
6.5.0-rc6-PMacG5-dirty #2 Tainted: G T
--------------------------------------------------------
kworker/1:1/60 just changed the state of lock:
c000000022802108 (&card->lock){..-.}-{2:2}, at: close_transaction+0x48/0x230 [firewire_core]
but this lock took another, SOFTIRQ-unsafe lock in the past:
(fs_reclaim){+.+.}-{0:0}


and interrupts could create inverse lock ordering between them.


other info that might help us debug this:
Possible interrupt unsafe locking scenario:

CPU0 CPU1
---- ----
lock(fs_reclaim);
local_irq_disable();
lock(&card->lock);
lock(fs_reclaim);
<Interrupt>
lock(&card->lock);

*** DEADLOCK ***

2 locks held by kworker/1:1/60:
#0: c000000020316538 ((wq_completion)firewire){+.+.}-{0:0}, at: process_one_work+0x26c/0x720
#1: c0000000037dbea8 ((work_completion)(&(&device->work)->work)){+.+.}-{0:0}, at: process_one_work+0x26c/0x720

the shortest dependencies between 2nd lock and 1st lock:
-> (fs_reclaim){+.+.}-{0:0} {
HARDIRQ-ON-W at:
lock_acquire+0x124/0x3e0
fs_reclaim_acquire+0x104/0x140
__kmem_cache_alloc_node+0x7c/0x570
kmalloc_trace+0x50/0x1a0
pmf_register_driver+0x120/0x530
pmac_pfunc_base_install+0x210/0x684
smp_core99_probe+0x1a4/0x944
smp_prepare_cpus+0x5b0/0x664
kernel_init_freeable+0x398/0x6d8
kernel_init+0x28/0x1a0
ret_from_kernel_user_thread+0x14/0x1c
SOFTIRQ-ON-W at:
lock_acquire+0x124/0x3e0
fs_reclaim_acquire+0x104/0x140
__kmem_cache_alloc_node+0x7c/0x570
kmalloc_trace+0x50/0x1a0
pmf_register_driver+0x120/0x530
pmac_pfunc_base_install+0x210/0x684
smp_core99_probe+0x1a4/0x944
smp_prepare_cpus+0x5b0/0x664
kernel_init_freeable+0x398/0x6d8
kernel_init+0x28/0x1a0
ret_from_kernel_user_thread+0x14/0x1c
INITIAL USE at:
lock_acquire+0x124/0x3e0
fs_reclaim_acquire+0x104/0x140
__kmem_cache_alloc_node+0x7c/0x570
kmalloc_trace+0x50/0x1a0
pmf_register_driver+0x120/0x530
pmac_pfunc_base_install+0x210/0x684
smp_core99_probe+0x1a4/0x944
smp_prepare_cpus+0x5b0/0x664
kernel_init_freeable+0x398/0x6d8
kernel_init+0x28/0x1a0
ret_from_kernel_user_thread+0x14/0x1c
}
... key at: [<c00000000234acb8>] __fs_reclaim_map+0x0/0x28
... acquired at:
lock_acquire+0x124/0x3e0
fs_reclaim_acquire+0x104/0x140
__kmem_cache_alloc_node+0x7c/0x570
__kmalloc+0x64/0x320
fw_core_handle_bus_reset+0x740/0xcb0 [firewire_core]
bus_reset_work+0x350/0x9f0 [firewire_ohci]
process_one_work+0x320/0x720
worker_thread+0x98/0x500
kthread+0x148/0x150
start_kernel_thread+0x14/0x18

-> (&card->lock){..-.}-{2:2} {
IN-SOFTIRQ-W at:
lock_acquire+0x124/0x3e0
_raw_spin_lock_irqsave+0x64/0xd0
close_transaction+0x48/0x230 [firewire_core]
handle_at_packet+0x158/0x260 [firewire_ohci]
context_tasklet+0x194/0x200 [firewire_ohci]
tasklet_action_common.constprop.0+0x2ac/0x5e0
__do_softirq+0x180/0x5bc
do_softirq_own_stack+0x38/0x80
__irq_exit_rcu+0x168/0x1c0
irq_exit+0x18/0x60
arch_local_irq_restore+0x13c/0x3a0
_raw_spin_unlock_irq+0x48/0x90
wq_worker_sleeping+0xfc/0x130
schedule+0xf0/0x190
schedule_timeout+0x234/0x2b0
__wait_for_common+0x100/0x2d0
fw_run_transaction+0x16c/0x1e0 [firewire_core]
read_rom+0x8c/0x110 [firewire_core]
read_config_rom+0xa4/0x560 [firewire_core]
fw_device_init+0x38/0x430 [firewire_core]
fw_device_workfn+0x24/0x50 [firewire_core]
process_one_work+0x320/0x720
worker_thread+0x98/0x500
kthread+0x148/0x150
start_kernel_thread+0x14/0x18
INITIAL USE at:
lock_acquire+0x124/0x3e0
_raw_spin_lock_irqsave+0x64/0xd0
fw_core_handle_bus_reset+0x7c/0xcb0 [firewire_core]
bus_reset_work+0x350/0x9f0 [firewire_ohci]
process_one_work+0x320/0x720
worker_thread+0x98/0x500
kthread+0x148/0x150
start_kernel_thread+0x14/0x18
}
... key at: [<c0003d0000360590>] __key.5+0x0/0x10 [firewire_core]
... acquired at:
__lock_acquire+0xa54/0x1f70
lock_acquire+0x124/0x3e0
_raw_spin_lock_irqsave+0x64/0xd0
close_transaction+0x48/0x230 [firewire_core]
handle_at_packet+0x158/0x260 [firewire_ohci]
context_tasklet+0x194/0x200 [firewire_ohci]
tasklet_action_common.constprop.0+0x2ac/0x5e0
__do_softirq+0x180/0x5bc
do_softirq_own_stack+0x38/0x80
__irq_exit_rcu+0x168/0x1c0
irq_exit+0x18/0x60
arch_local_irq_restore+0x13c/0x3a0
_raw_spin_unlock_irq+0x48/0x90
wq_worker_sleeping+0xfc/0x130
schedule+0xf0/0x190
schedule_timeout+0x234/0x2b0
__wait_for_common+0x100/0x2d0
fw_run_transaction+0x16c/0x1e0 [firewire_core]
read_rom+0x8c/0x110 [firewire_core]
read_config_rom+0xa4/0x560 [firewire_core]
fw_device_init+0x38/0x430 [firewire_core]
fw_device_workfn+0x24/0x50 [firewire_core]
process_one_work+0x320/0x720
worker_thread+0x98/0x500
kthread+0x148/0x150
start_kernel_thread+0x14/0x18


stack backtrace:
CPU: 1 PID: 60 Comm: kworker/1:1 Tainted: G T 6.5.0-rc6-PMacG5-dirty #2
Hardware name: PowerMac11,2 PPC970MP 0x440101 PowerMac
Workqueue: firewire fw_device_workfn [firewire_core]
Call Trace:
[c00000000fff38a0] [c000000000ff342c] dump_stack_lvl+0xb0/0x11c (unreliable)
[c00000000fff38e0] [c00000000018f65c] print_irq_inversion_bug.part.0+0x220/0x27c
[c00000000fff3970] [c000000000188288] mark_lock+0xbe8/0x1620
[c00000000fff3ab0] [c000000000189af4] __lock_acquire+0xa54/0x1f70
[c00000000fff3be0] [c00000000018bdc4] lock_acquire+0x124/0x3e0
[c00000000fff3ce0] [c000000001042024] _raw_spin_lock_irqsave+0x64/0xd0
[c00000000fff3d20] [c0003d00004076e0] close_transaction+0x48/0x230 [firewire_core]
[c00000000fff3d90] [c0003d00004f99b0] handle_at_packet+0x158/0x260 [firewire_ohci]
[c00000000fff3de0] [c0003d00004fcf9c] context_tasklet+0x194/0x200 [firewire_ohci]
[c00000000fff3e40] [c0000000000e363c] tasklet_action_common.constprop.0+0x2ac/0x5e0
[c00000000fff3ec0] [c000000001043750] __do_softirq+0x180/0x5bc
[c00000000fff3fe0] [c000000000014008] do_softirq_own_stack+0x38/0x80
[c0000000037db6d0] [c0000000000e21e8] __irq_exit_rcu+0x168/0x1c0
[c0000000037db700] [c0000000000e3048] irq_exit+0x18/0x60
[c0000000037db720] [c0000000000304ec] arch_local_irq_restore+0x13c/0x3a0
[c0000000037db7b0] [c0000000010424c8] _raw_spin_unlock_irq+0x48/0x90
[c0000000037db7e0] [c000000000113e0c] wq_worker_sleeping+0xfc/0x130
[c0000000037db810] [c000000001035110] schedule+0xf0/0x190
[c0000000037db880] [c0000000010413a4] schedule_timeout+0x234/0x2b0
[c0000000037db960] [c000000001035e90] __wait_for_common+0x100/0x2d0
[c0000000037db9f0] [c0003d0000406dd4] fw_run_transaction+0x16c/0x1e0 [firewire_core]
[c0000000037dbc40] [c0003d0000401fe4] read_rom+0x8c/0x110 [firewire_core]
[c0000000037dbc90] [c0003d00004030dc] read_config_rom+0xa4/0x560 [firewire_core]
[c0000000037dbd70] [c0003d0000403900] fw_device_init+0x38/0x430 [firewire_core]
[c0000000037dbe20] [c0003d000040195c] fw_device_workfn+0x24/0x50 [firewire_core]
[c0000000037dbe40] [c0000000001100b0] process_one_work+0x320/0x720
[c0000000037dbf10] [c000000000110548] worker_thread+0x98/0x500
[c0000000037dbf90] [c00000000011f938] kthread+0x148/0x150
[c0000000037dbfe0] [c00000000000bfb0] start_kernel_thread+0x14/0x18
firewire_core 0001:03:0e.0: created device fw0: GUID 001451fffe26fa7e, S800
firewire_core 0001:03:0e.0: phy config: new root=ffc1, gap_count=5
[...]

Some data on the machine:
# lspci
0000:00:0b.0 PCI bridge: Apple Inc. CPC945 PCIe Bridge
0000:0a:00.0 VGA compatible controller: NVIDIA Corporation NV43 [GeForce 6600 LE] (rev a2)
0001:00:00.0 Host bridge: Apple Inc. U4 HT Bridge
0001:00:01.0 PCI bridge: Broadcom BCM5780 [HT2000] PCI-X bridge (rev a3)
0001:00:02.0 PCI bridge: Broadcom BCM5780 [HT2000] PCI-X bridge (rev a3)
0001:00:03.0 PCI bridge: Broadcom BCM5780 [HT2000] PCI-Express Bridge (rev a3)
0001:00:04.0 PCI bridge: Broadcom BCM5780 [HT2000] PCI-Express Bridge (rev a3)
0001:00:05.0 PCI bridge: Broadcom BCM5780 [HT2000] PCI-Express Bridge (rev a3)
0001:00:06.0 PCI bridge: Broadcom BCM5780 [HT2000] PCI-Express Bridge (rev a3)
0001:00:07.0 PCI bridge: Apple Inc. Shasta PCI Bridge
0001:00:08.0 PCI bridge: Apple Inc. Shasta PCI Bridge
0001:00:09.0 PCI bridge: Apple Inc. Shasta PCI Bridge
0001:01:07.0 Unassigned class [ff00]: Apple Inc. Shasta Mac I/O
0001:01:0b.0 USB controller: NEC Corporation OHCI USB Controller (rev 43)
0001:01:0b.1 USB controller: NEC Corporation OHCI USB Controller (rev 43)
0001:01:0b.2 USB controller: NEC Corporation uPD72010x USB 2.0 Controller (rev 04)
0001:03:0c.0 IDE interface: Broadcom K2 SATA
0001:03:0d.0 Unassigned class [ff00]: Apple Inc. Shasta IDE
0001:03:0e.0 FireWire (IEEE 1394): Apple Inc. Shasta Firewire
0001:05:04.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5780 Gigabit Ethernet (rev 03)
0001:05:04.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5780 Gigabit Ethernet (rev 03)
0001:06:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV516 [Radeon X1300/X1550 Series]
0001:06:00.1 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] RV516 [Radeon X1300/X1550 Series] (Secondary)
0001:08:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
0001:09:00.0 Non-Volatile memory controller: Sandisk Corp WD Black SN750 / PC SN730 NVMe SSD

# lspci -vv -s 0001:03:0e.0
0001:03:0e.0 FireWire (IEEE 1394): Apple Inc. Shasta Firewire (prog-if 10 [OHCI])
Subsystem: Apple Inc. Shasta Firewire
Device tree node: /sys/firmware/devicetree/base/ht@0,f2000000/pci@9/firewire@e
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 248 (3000ns min, 6000ns max), Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 39
Region 0: Memory at fa400000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [44] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME+
Kernel driver in use: firewire_ohci
Kernel modules: firewire_ohci


Full dmesg + kernel .config attached.

Regards,
Erhard

Attachment: dmesg_65-rc6_g5_05
Description: Binary data

Attachment: config_65-rc6_g5
Description: Binary data