Re: Re: nvme-tcp: fix a possible UAF when failing to send request
From: zhang.guanghui@xxxxxxxx
Date: Wed Feb 12 2025 - 04:48:10 EST
Hi, Thanks.
I will test this patch, but I am worried that it may affect performance.
Should we also consider adding NULL pointer protection?
zhang.guanghui@xxxxxxxx
From: Maurizio Lombardi
Date: 2025-02-12 16:52
To: Maurizio Lombardi; zhang.guanghui@xxxxxxxx; chunguang.xu
CC: mgurtovoy; sagi; kbusch; sashal; linux-kernel; linux-nvme; linux-block
Subject: Re: nvme-tcp: fix a possible UAF when failing to send request
On Wed Feb 12, 2025 at 9:11 AM CET, Maurizio Lombardi wrote:
> On Tue Feb 11, 2025 at 9:04 AM CET, zhang.guanghui@xxxxxxxx wrote:
>> Hi
>>
>> This is a race condition, and I can't reproduce it reliably yet. I have not tested the latest kernel, but in fact I've synced some nvme-tcp patches from the latest upstream,
>
> Hello, could you try this patch?
>
> queue_lock should protect against concurrent "error recovery",
> + mutex_lock(&queue->queue_lock);
Unfortunately I've just realized that queue_lock won't save us
from the race against the controller reset: we could still end up
locking a destroyed mutex. So just try this
simplified patch; I will try to figure out something else:
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 841238f38fdd..b714e1691c30 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2660,7 +2660,10 @@ static int nvme_tcp_poll(struct blk_mq_hw_ctx *hctx, struct io_comp_batch *iob)
 	set_bit(NVME_TCP_Q_POLLING, &queue->flags);
 	if (sk_can_busy_loop(sk) && skb_queue_empty_lockless(&sk->sk_receive_queue))
 		sk_busy_loop(sk, true);
+
+	mutex_lock(&queue->send_mutex);
 	nvme_tcp_try_recv(queue);
+	mutex_unlock(&queue->send_mutex);
 	clear_bit(NVME_TCP_Q_POLLING, &queue->flags);
 	return queue->nr_cqe;
 }
Maurizio
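
For readers following the thread, here is a rough userspace sketch of the idea
behind the simplified patch above: the polling path takes the same mutex as
the send path, so the two can never touch the shared state at the same time.
This is only an analogy built on pthreads, not kernel code. struct demo_queue
and every demo_* function are hypothetical names invented for illustration;
only the send_mutex field name is borrowed from the patch.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a queue; NOT the kernel's struct nvme_tcp_queue. */
struct demo_queue {
	pthread_mutex_t send_mutex;
	long in_flight;	/* requests sent but not yet completed */
	long nr_cqe;	/* completions seen so far */
};

struct demo_args {
	struct demo_queue *q;
	int n;
};

/* Send path: submits one request under send_mutex. */
static void demo_send(struct demo_queue *q)
{
	pthread_mutex_lock(&q->send_mutex);
	q->in_flight++;
	pthread_mutex_unlock(&q->send_mutex);
}

/* Poll path: completes everything in flight, serialized against demo_send(). */
static long demo_poll(struct demo_queue *q)
{
	long done;

	pthread_mutex_lock(&q->send_mutex);
	q->nr_cqe += q->in_flight;
	q->in_flight = 0;
	done = q->nr_cqe;
	pthread_mutex_unlock(&q->send_mutex);
	return done;
}

static void *demo_sender(void *arg)
{
	struct demo_args *a = arg;

	for (int i = 0; i < a->n; i++)
		demo_send(a->q);
	return NULL;
}

static void *demo_poller(void *arg)
{
	struct demo_args *a = arg;

	for (int i = 0; i < a->n; i++)
		demo_poll(a->q);
	return NULL;
}

/* Runs one sender and one poller concurrently; returns total completions. */
static long demo_run(int n)
{
	struct demo_queue q = { .in_flight = 0, .nr_cqe = 0 };
	struct demo_args args = { &q, n };
	pthread_t sender, poller;
	long total;
	int rc;

	pthread_mutex_init(&q.send_mutex, NULL);
	rc = pthread_create(&sender, NULL, demo_sender, &args);
	assert(rc == 0);
	rc = pthread_create(&poller, NULL, demo_poller, &args);
	assert(rc == 0);

	pthread_join(sender, NULL);
	pthread_join(poller, NULL);

	total = demo_poll(&q);	/* final poll collects anything still in flight */
	/* Safe to destroy only because both users have already been joined. */
	pthread_mutex_destroy(&q.send_mutex);
	return total;
}
```

Calling demo_run(n) should return n, since every request is counted exactly
once. Note that the sketch destroys the mutex only after both threads have
been joined. That lifetime guarantee is exactly what the message above says
queue_lock lacks against a controller reset, so the sketch illustrates the
serialization only and does not model the reset race.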