Re: [PATCH] xprtrdma: Decrement re_receiving on the early exit paths

From: Chuck Lever

Date: Mon Feb 23 2026 - 15:03:51 EST


On 2/23/26 1:28 PM, Eric Badger wrote:
> In the event that rpcrdma_post_recvs() fails to create a work request
> (due to memory allocation failure, say) or otherwise exits early, we
> should decrement ep->re_receiving before returning. Otherwise we will
> hang in rpcrdma_xprt_drain() as re_receiving will never reach zero and
> the completion will never be triggered.
>
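
For readers following along, the failure mode described above can be
sketched as a small single-threaded userspace model. This is not the
xprtrdma code: struct ep_model, post_recvs_model(), post_recvs_exit()
and drain_would_hang() are hypothetical names, the counter uses C11
atomics rather than the kernel's atomic_t, and the completion is reduced
to a boolean flag. The wake-up condition is an assumption chosen to
match the description, namely that the drain path only waits when the
counter is still non-zero.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the endpoint state; not the kernel struct. */
struct ep_model {
	atomic_int receiving;	/* models ep->re_receiving */
	bool done_signaled;	/* models the completion the drain path waits on */
};

static void ep_model_init(struct ep_model *ep)
{
	atomic_init(&ep->receiving, 0);
	ep->done_signaled = false;
}

/* Leave the "posting receives" section and wake a waiting drainer, if any. */
static void post_recvs_exit(struct ep_model *ep)
{
	/* atomic_fetch_sub() returns the old value; an old value above 1
	 * means a drainer has also incremented the counter and is waiting. */
	if (atomic_fetch_sub(&ep->receiving, 1) > 1)
		ep->done_signaled = true;
}

/*
 * Model of the posting path. 'alloc_fails' simulates failing to create
 * a work request; 'fixed' selects whether the early exit decrements.
 * Returns 0 on success, -1 on the early-exit path.
 */
static int post_recvs_model(struct ep_model *ep, bool alloc_fails, bool fixed)
{
	atomic_fetch_add(&ep->receiving, 1);

	if (alloc_fails) {
		if (fixed)
			post_recvs_exit(ep);	/* the decrement the patch adds */
		return -1;			/* unfixed: counter leaks here */
	}

	/* ... the work request would be posted here ... */
	post_recvs_exit(ep);
	return 0;
}

/*
 * Model of the drain path. Returns true if, in this model, the drainer
 * would wait on a completion that nobody is left to signal.
 */
static bool drain_would_hang(struct ep_model *ep)
{
	if (atomic_fetch_add(&ep->receiving, 1) > 0)
		return !ep->done_signaled;
	return false;
}
```

In this model, calling post_recvs_model() with alloc_fails set and fixed
clear leaves the counter at 1, so a later drain_would_hang() reports a
hang; with fixed set the counter returns to 0 and the drain proceeds.
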
> On a system with high memory pressure, this can appear as the following
> hung task:
>
> INFO: task kworker/u385:17:8393 blocked for more than 122 seconds.
> Tainted: G S E 6.19.0 #3
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:kworker/u385:17 state:D stack:0 pid:8393 tgid:8393 ppid:2 task_flags:0x4248060 flags:0x00080000
> Workqueue: xprtiod xprt_autoclose [sunrpc]
> Call Trace:
> <TASK>
> __schedule+0x48b/0x18b0
> ? ib_post_send_mad+0x247/0xae0 [ib_core]
> schedule+0x27/0xf0
> schedule_timeout+0x104/0x110
> __wait_for_common+0x98/0x180
> ? __pfx_schedule_timeout+0x10/0x10
> wait_for_completion+0x24/0x40
> rpcrdma_xprt_disconnect+0x444/0x460 [rpcrdma]
> xprt_rdma_close+0x12/0x40 [rpcrdma]
> xprt_autoclose+0x5f/0x120 [sunrpc]
> process_one_work+0x191/0x3e0
> worker_thread+0x2e3/0x420
> ? __pfx_worker_thread+0x10/0x10
> kthread+0x10d/0x230
> ? __pfx_kthread+0x10/0x10
> ret_from_fork+0x273/0x2b0
> ? __pfx_kthread+0x10/0x10
> ret_from_fork_asm+0x1a/0x30
>
> Fixes: 15788d1d1077 ("xprtrdma: Do not refresh Receive Queue while it is draining")
> Signed-off-by: Eric Badger <ebadger@xxxxxxxxxxxxxxx>

Reviewed-by: Chuck Lever <chuck.lever@xxxxxxxxxx>


--
Chuck Lever