Re: [PATCH] IB/core: Fix ABBA deadlock in rdma_dev_exit_net

From: Jason Gunthorpe

Date: Mon Dec 15 2025 - 19:57:07 EST


On Thu, Dec 11, 2025 at 04:08:13PM +0800, wujing wrote:
> Classic ABBA deadlock due to inconsistent lock ordering between
> rdma_dev_exit_net() and rdma_dev_init_net():
>
> Thread A (cleanup_net workqueue -> kworker/u256:1):
> rdma_dev_exit_net():
> down_write(&rdma_nets_rwsem) <- held at line rdma_dev_exit_net+0x60
> down_read(&devices_rwsem) <- waiting (shown in rwsem_down_write_slowpath)

This isn't right; rdma_dev_exit_net() has already unlocked rdma_nets_rwsem
by the time it takes devices_rwsem:

	down_write(&rdma_nets_rwsem);
	/*
	 * Prevent the ID from being re-used and hide the id from xa_for_each.
	 */
	ret = xa_err(xa_store(&rdma_nets, rnet->id, NULL, GFP_KERNEL));
	WARN_ON(ret);
	up_write(&rdma_nets_rwsem);	<------

	down_read(&devices_rwsem);

The two locks are not nested here, so there is no dependency between them.

> Thread B (stress-ng-clone processes):
> rdma_dev_init_net():
> down_read(&devices_rwsem) <- held at line rdma_dev_init_net+0x120
> down_read(&rdma_nets_rwsem) <- waiting (blocked by pending writer from Thread A)

This one is nested though.
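
To make the distinction concrete, here is a minimal userspace sketch of the
ordering argument. It is not kernel code and not how lockdep is implemented;
the lock IDs, the acquire()/release() helpers and the two *_pattern()
functions are hypothetical stand-ins that only replay the lock sequences
described above and record which lock was taken while another was held.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical IDs standing in for rdma_nets_rwsem and devices_rwsem. */
enum { NETS = 0, DEVICES = 1, NLOCKS = 2 };

static bool held[NLOCKS];          /* locks currently held by this context */
static bool order[NLOCKS][NLOCKS]; /* order[a][b]: b was taken while a was held */

static void reset(void)
{
	memset(held, 0, sizeof(held));
	memset(order, 0, sizeof(order));
}

static void acquire(int lock)
{
	/* Record a dependency from every held lock to the new one. */
	for (int i = 0; i < NLOCKS; i++)
		if (held[i])
			order[i][lock] = true;
	held[lock] = true;
}

static void release(int lock)
{
	held[lock] = false;
}

/* Sequence from rdma_dev_exit_net(): the first lock is dropped first. */
static void exit_net_pattern(void)
{
	acquire(NETS);
	release(NETS);
	acquire(DEVICES);
	release(DEVICES);
}

/* Sequence reported for rdma_dev_init_net(): the locks are nested. */
static void init_net_pattern(void)
{
	acquire(DEVICES);
	acquire(NETS);
	release(NETS);
	release(DEVICES);
}

/* An ABBA inversion needs a dependency in both directions. */
static bool has_abba(int a, int b)
{
	return order[a][b] && order[b][a];
}
```

Calling reset(), exit_net_pattern() and init_net_pattern() in turn records
only the DEVICES -> NETS dependency, so has_abba(NETS, DEVICES) stays false.
The sketch ignores read/write modes and everything else a real lock validator
tracks.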

I don't know what your bug is, but it is not some trivial ABBA
deadlock; lockdep would have found something like that ages ago.

Jason