Hello,
On Sun, 2 Oct 2022 10:12:15 -0700 James Smart wrote:
On 10/1/2022 5:19 PM, Duoming Zhou wrote:
The function lpfc_poll_timeout() is a timer handler that runs in an
atomic context, but it calls "kzalloc(.., GFP_KERNEL)" that may sleep.
As a result, the sleep-in-atomic-context bug will happen. The processes
is shown below:
lpfc_poll_timeout()
lpfc_sli_handle_fast_ring_event()
lpfc_sli_process_unsol_iocb()
lpfc_complete_unsol_iocb()
lpfc_nvme_unsol_ls_handler()
lpfc_nvme_handle_lsreq()
nvme_fc_rcv_ls_req()
kzalloc(sizeof(.., GFP_KERNEL) //may sleep
This patch changes the gfp_t parameter of kzalloc() from GFP_KERNEL to
GFP_ATOMIC in order to mitigate the bug.
Fixes: 14fd1e98afaf ("nvme-fc: Add Disconnect Association Rcv support")
Signed-off-by: Duoming Zhou <duoming@xxxxxxxxxx>
---
drivers/nvme/host/fc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 127abaf9ba5..36698dfc8b3 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -1754,7 +1754,7 @@ nvme_fc_rcv_ls_req(struct nvme_fc_remote_port *portptr,
lsop = kzalloc(sizeof(*lsop) +
sizeof(union nvmefc_ls_requests) +
sizeof(union nvmefc_ls_responses),
- GFP_KERNEL);
+ GFP_ATOMIC);
if (!lsop) {
dev_info(lport->dev,
"RCV %s LS failed: No memory\n",
I would prefer this was fixed within lpfc rather than introducing atomic
allocations (1st in either host or target transport). It was introduced
by lpfc change in irq handling style.
Thank your for your reply and suggestions!
Do you think change the lpfc_poll_timeout() to a delayed_work is better?
Best regards,
Duoming Zhou