[PATCH] RDMA/bng_re: return a timeout when firmware responses stall
From: Pengpeng Hou
Date: Wed Jun 24 2026 - 20:38:02 EST

__wait_for_resp() documents that it returns a non-zero error when a
firmware command does not complete, and bng_re_rcfw_send_message() already
marks the firmware as stalled when the helper returns -ENODEV.

However, the helper discards the return value of wait_event_timeout(), so
an expired wait goes unnoticed. If the response slot is still in use after
the timeout and after the polled CREQ service attempt, the loop starts
another full timeout period and can repeat forever.

Return -ENODEV when the wait timed out and there is still no response. The
existing caller then marks FIRMWARE_STALL_DETECTED and returns
-ETIMEDOUT to the command issuer.

Signed-off-by: Pengpeng Hou <pengpeng@xxxxxxxxxxx>
---
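Note (not part of the commit message): the control flow described above can
be sketched in userspace as follows. This is a minimal illustration, not the
driver code. struct slot, fake_wait(), fake_poll(), wait_for_resp() and
TIMEOUT_TICKS are hypothetical stand-ins; fake_wait() only mimics the
documented wait_event_timeout() convention of returning 0 when the timeout
elapsed with the condition still false, and a positive remaining time
otherwise.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define TIMEOUT_TICKS 100UL	/* hypothetical value, not the driver's */

/* Hypothetical stand-in for one response slot (crsqe in the patch). */
struct slot {
	bool in_use;		/* true while no response has arrived */
	bool done_by_wait;	/* simulate: response arrives during the wait */
	bool done_by_poll;	/* simulate: response is found by polling */
};

/*
 * Hypothetical stand-in for wait_event_timeout(): returns 0 if the
 * timeout elapsed and the slot is still in use, otherwise the
 * (positive) time that was left when the slot was released.
 */
static unsigned long fake_wait(struct slot *s, unsigned long timeout)
{
	assert(timeout > 0);
	if (s->done_by_wait) {
		s->in_use = false;
		return timeout / 2;
	}
	return 0;
}

/* Hypothetical stand-in for the polled CREQ service attempt. */
static void fake_poll(struct slot *s)
{
	if (s->done_by_poll)
		s->in_use = false;
}

/* Same shape as the patched loop: give up once a wait has expired. */
static int wait_for_resp(struct slot *s)
{
	unsigned long time_left;

	do {
		time_left = fake_wait(s, TIMEOUT_TICKS);
		if (!s->in_use)
			return 0;

		fake_poll(s);
		if (!s->in_use)
			return 0;

		/* Without this check a stalled slot would loop forever. */
		if (!time_left)
			return -ENODEV;
	} while (true);
}
```

In this sketch a slot with both done_by_wait and done_by_poll cleared makes
wait_for_resp() return -ENODEV after one timeout period instead of spinning,
which is the behaviour the patch aims for.
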
drivers/infiniband/hw/bng_re/bng_fw.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/hw/bng_re/bng_fw.c b/drivers/infiniband/hw/bng_re/bng_fw.c
index 50156c300..ab6a2d2e9 100644
--- a/drivers/infiniband/hw/bng_re/bng_fw.c
+++ b/drivers/infiniband/hw/bng_re/bng_fw.c
@@ -401,14 +401,15 @@ static int __wait_for_resp(struct bng_re_rcfw *rcfw, u16 cookie)
{
struct bng_re_cmdq_ctx *cmdq;
struct bng_re_crsqe *crsqe;
+ unsigned long time_left;
cmdq = &rcfw->cmdq;
crsqe = &rcfw->crsqe_tbl[cookie];
do {
- wait_event_timeout(cmdq->waitq,
- !crsqe->is_in_used,
- secs_to_jiffies(rcfw->max_timeout));
+ time_left = wait_event_timeout(cmdq->waitq,
+ !crsqe->is_in_used,
+ secs_to_jiffies(rcfw->max_timeout));
if (!crsqe->is_in_used)
return 0;
@@ -417,6 +418,9 @@ static int __wait_for_resp(struct bng_re_rcfw *rcfw, u16 cookie)
if (!crsqe->is_in_used)
return 0;
+
+ if (!time_left)
+ return -ENODEV;
} while (true);
};
--
2.50.1 (Apple Git-155)