[PATCH V3] rdma_rxe: call comp_handler without holding cq->cq_lock

From: Philipp Reisner
Date: Thu Sep 25 2025 - 15:57:28 EST


Allow the comp_handler callback implementation to call ib_poll_cq().
With the rdma_rxe driver, ib_poll_cq() ends up in rxe_poll_cq(), which
takes cq->cq_lock. Because rxe_cq_post() currently invokes comp_handler
while holding cq->cq_lock, that leads to a spinlock deadlock.

The Mellanox and Intel drivers allow a comp_handler callback
implementation to call ib_poll_cq().

Avoid the deadlock by calling the comp_handler callback without
holding cq->cq_lock.

Other InfiniBand drivers call the comp_handler callback from a single
thread. In the RXE driver, holding cq->cq_lock has achieved that up to
now. As that is removed, introduce a new lock dedicated to making the
execution of the comp_handler single-threaded.
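
The locking order described above can be illustrated with a minimal
userspace sketch. It is not the driver code: pthread mutexes stand in
for the kernel spinlocks, and struct demo_cq and every demo_* function
are hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct rxe_cq; not the kernel type. */
struct demo_cq {
	pthread_mutex_t cq_lock;           /* protects the queue state */
	pthread_mutex_t comp_handler_lock; /* serializes the handler */
	int pending;                       /* completions waiting in the queue */
	int notify;                        /* non-zero while the CQ is armed */
	int polled;                        /* completions consumed so far */
	void (*comp_handler)(struct demo_cq *cq);
};

/* Stand-in for rxe_poll_cq(): takes cq_lock to drain the queue. */
static int demo_poll_cq(struct demo_cq *cq)
{
	int n;

	pthread_mutex_lock(&cq->cq_lock);
	n = cq->pending;
	cq->pending = 0;
	cq->polled += n;
	pthread_mutex_unlock(&cq->cq_lock);
	return n;
}

/* A completion handler that polls the CQ, as a consumer might do. */
static void demo_handler(struct demo_cq *cq)
{
	demo_poll_cq(cq);
}

/* Stand-in for rxe_cq_post(): decide under cq_lock, call outside it. */
static int demo_cq_post(struct demo_cq *cq)
{
	bool invoke_handler = false;

	pthread_mutex_lock(&cq->cq_lock);
	cq->pending++;
	if (cq->notify) {
		cq->notify = 0;
		invoke_handler = true;
	}
	pthread_mutex_unlock(&cq->cq_lock);

	if (invoke_handler) {
		/* cq_lock is released here, so the handler may poll. */
		pthread_mutex_lock(&cq->comp_handler_lock);
		cq->comp_handler(cq);
		pthread_mutex_unlock(&cq->comp_handler_lock);
	}
	return 0;
}

/* Posts one completion to an armed CQ; returns the number polled. */
static int demo_run(void)
{
	struct demo_cq cq = { .notify = 1, .comp_handler = demo_handler };

	pthread_mutex_init(&cq.cq_lock, NULL);
	pthread_mutex_init(&cq.comp_handler_lock, NULL);

	demo_cq_post(&cq);
	assert(cq.pending == 0 && cq.notify == 0);

	pthread_mutex_destroy(&cq.comp_handler_lock);
	pthread_mutex_destroy(&cq.cq_lock);
	return cq.polled;
}
```

Calling demo_run() from a main() exercises the path in which the
handler polls the CQ. If demo_cq_post() called the handler while still
holding cq_lock, demo_poll_cq() would block on the same non-recursive
lock, which is the deadlock this patch avoids.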

Changelog:
v2 -> v3:
- make execution of comp_handler single-threaded

v2: https://lore.kernel.org/lkml/20250822081941.989520-1-philipp.reisner@xxxxxxxxxx/

v1 -> v2:
- Only reset cq->notify to 0 when invoking the comp_handler

v1: https://lore.kernel.org/all/20250806123921.633410-1-philipp.reisner@xxxxxxxxxx/
====================

Signed-off-by: Philipp Reisner <philipp.reisner@xxxxxxxxxx>
Reviewed-by: Zhu Yanjun <yanjun.zhu@xxxxxxxxx>
---
 drivers/infiniband/sw/rxe/rxe_cq.c    | 10 +++++++++-
 drivers/infiniband/sw/rxe/rxe_verbs.h |  1 +
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_cq.c b/drivers/infiniband/sw/rxe/rxe_cq.c
index fffd144d509e..8d94cef7bd50 100644
--- a/drivers/infiniband/sw/rxe/rxe_cq.c
+++ b/drivers/infiniband/sw/rxe/rxe_cq.c
@@ -62,6 +62,7 @@ int rxe_cq_from_init(struct rxe_dev *rxe, struct rxe_cq *cq, int cqe,
 	cq->is_user = uresp;

 	spin_lock_init(&cq->cq_lock);
+	spin_lock_init(&cq->comp_handler_lock);
 	cq->ibcq.cqe = cqe;
 	return 0;
 }
@@ -88,6 +89,7 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
 	int full;
 	void *addr;
 	unsigned long flags;
+	bool invoke_handler = false;

 	spin_lock_irqsave(&cq->cq_lock, flags);

@@ -113,11 +115,17 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
 	if ((cq->notify & IB_CQ_NEXT_COMP) ||
 	    (cq->notify & IB_CQ_SOLICITED && solicited)) {
 		cq->notify = 0;
-		cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
+		invoke_handler = true;
 	}

 	spin_unlock_irqrestore(&cq->cq_lock, flags);

+	if (invoke_handler) {
+		spin_lock_irqsave(&cq->comp_handler_lock, flags);
+		cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
+		spin_unlock_irqrestore(&cq->comp_handler_lock, flags);
+	}
+
 	return 0;
 }

diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h
index fd48075810dd..04ec60a786f8 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.h
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
@@ -62,6 +62,7 @@ struct rxe_cq {
 	struct rxe_pool_elem	elem;
 	struct rxe_queue	*queue;
 	spinlock_t		cq_lock;
+	spinlock_t		comp_handler_lock;
 	u8			notify;
 	bool			is_user;
 	atomic_t		num_wq;
--
2.50.1