Re: [PATCH v3 0/2] PCI/IOV: Fix deadlock when removing PF with enabled SR-IOV

From: Bjorn Helgaas

Date: Wed Feb 25 2026 - 13:32:57 EST


On Wed, Feb 25, 2026 at 03:59:49PM +0100, Dragos Tatulea wrote:
> On 23.02.26 19:34, Dragos Tatulea wrote:
> > On 23.02.26 18:33, Benjamin Block wrote:
> >> On Mon, Feb 23, 2026 at 03:10:35PM +0100, Dragos Tatulea wrote:
> >>> After pulling in these commits in our internal tree we can see the
> >>> lockdep splat from below in many internal tests. We are still trying to
> >>> find an easy repro for this. We had to internally revert both of them.
> >>>
> >>> I noticed some similar discussion in another thread [1] but there it
> >>> seems that these changes are actually fixing the issue which is not
> >>> the case for us.
> >>>
> >>> ------------[ cut here ]------------
> >>> WARNING: drivers/pci/remove.c:130 at pci_stop_and_remove_bus_device+0x39/0x40, CPU#2: modprobe/12956
> >>> Modules linked in: mlx5_core(-) act_tunnel_key vxlan dummy act_mirred act_gact cls_flower act_police act_ct nf_flow_table [...]
> >>> CPU: 2 UID: 0 PID: 12956 Comm: modprobe Not tainted 6.19.0net_next_e834b5e #1 PREEMPT
> >>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> >>> RIP: 0010:pci_stop_and_remove_bus_device+0x39/0x40
> >>> Code: [...]
> >>> RSP: 0018:ffff888164c9fd10 EFLAGS: 00010246
> >>> RAX: 0000000000000000 RBX: ffff888188ff2000 RCX: 0000000000000001
> >>> RDX: 0000000000000046 RSI: ffffffff8307e068 RDI: ffff88816bf4c9c0
> >>> RBP: ffff888188ff2000 R08: 00000000000000f4 R09: ffff88816bf4c080
> >>> R10: 0000000000000001 R11: 0000000000000003 R12: 0000000000000000
> >>> R13: ffff888164c9fd27 R14: 0000000000000002 R15: 0000000000000000
> >>> FS: 00007f52364bd740(0000) GS:ffff8885a9019000(0000) knlGS:0000000000000000
> >>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>> CR2: 00005622dbf749d8 CR3: 0000000169132004 CR4: 0000000000372eb0
> >>> Call Trace:
> >>> <TASK>
> >>> pci_iov_remove_virtfn+0xbd/0x120
> >>> sriov_disable+0x30/0xe0
> >>> mlx5_sriov_disable+0x50/0xa0 [mlx5_core]
> >>> remove_one+0x68/0xe0 [mlx5_core]
> >>> pci_device_remove+0x39/0xa0
> >>> device_release_driver_internal+0x1e4/0x240
> >>> driver_detach+0x47/0x90
> >>> bus_remove_driver+0x84/0x110
> >>> pci_unregister_driver+0x3b/0x90
> >>
> >> This looks pretty much like what Ionut is trying to fix in
> >> v1: https://lore.kernel.org/linux-pci/20260214193235.262219-3-ionut.nechita@xxxxxxxxxxxxx/T/
> >> v2: https://lore.kernel.org/linux-pci/20260219212648.82606-1-ionut.nechita@xxxxxxxxxxxxx/T/
> >>
> >> Maybe try giving those patches a spin. I think one easy way to hit this sort
> >> of thing is to try unbinding a PF that has 1 or more VFs attached to it from
> >> some device driver. The "trick" is that SR-IOV has to be active.
> > Thanks for the pointer. Will try it.
> >
> Took the v2 and it did the trick. Thanks!
> Is it worth a Tested-by tag from me?

Definitely. I always like to include a Tested-by, both to acknowledge
your effort in testing and to be able to include you if we trip over
similar issues in the future.

If you respond to the patch you tested and include "Tested-by", the
tools will pick it up automatically.