Re: [PATCH] IB/core: Fix ABBA deadlock in rdma_dev_exit_net
From: Michael Gur
Date: Tue Dec 16 2025 - 09:02:19 EST
On 12/16/2025 11:59 AM, wujing wrote:
> Hi Jason,
> You're right that the locks aren't nested in rdma_dev_exit_net() - it does release
> rdma_nets_rwsem before acquiring devices_rwsem. However, this is still an ABBA deadlock,
> just not the trivial nested kind. The issue is caused by **rwsem writer priority**
> and lock ordering inconsistency.
> Here's the actual deadlock scenario:
> **Thread A (rdma_dev_exit_net - cleanup_net workqueue):**
> ```
> down_write(&rdma_nets_rwsem); // Acquired
> xa_store(&rdma_nets, ...);
> up_write(&rdma_nets_rwsem); // Released
> down_read(&devices_rwsem); // Waiting here <-- BLOCKED
> ```
> **Thread B (rdma_dev_init_net - stress-ng-clone):**
> ```
> down_read(&devices_rwsem); // Acquired
> down_read(&rdma_nets_rwsem); // Waiting here <-- BLOCKED
> ```
> The deadlock happens because:
> 1. Thread A releases rdma_nets_rwsem as a **writer**
> 2. Thread B (and many others) are waiting to acquire rdma_nets_rwsem as **readers**
> 3. Thread A then tries to acquire devices_rwsem as a reader
> 4. BUT: rwsem gives priority to pending writers over new readers
> 5. Since Thread A was a pending writer on rdma_nets_rwsem, Thread B's read request is blocked
> 6. Thread B holds devices_rwsem, which Thread A needs
> 7. Thread A holds the "writer priority slot" on rdma_nets_rwsem, which Thread B needs
Why would Thread A still hold any writer priority after calling up_write()?
The kernel log is also not consistent with this analysis: the thread
running rdma_dev_exit_net() is stuck on the down_write(), not on the
down_read().
Maybe what we have is a thread running some net namespace operation
while holding rdma_nets_rwsem and starving all other threads.
I'm not sure how many devices and namespaces it would take to make it
block for this long, but I'd assume it's possible under stress
testing.