[PATCH] [v2] nvme: replace meaningless judgement by checking whether req is null
From: Xianting Tian
Date: Tue Sep 22 2020 - 02:33:29 EST
Currently, we use nvmeq->q_depth as the upper limit for a valid tag in
nvme_handle_cqe(), it is not correct. Because the available tag number
is recorded in tagset, which is not equal to nvmeq->q_depth.
The nvme driver registers interrupts for queues before initializing the
tagset, because it uses the number of successful request_irq() calls to
configure the tagset parameters. This allows a race condition with the
current tag validity check if the controller happens to produce an
interrupt with a corrupted CQE before the tagset is initialized.
Replace the driver's indirect tag check with the one already provided by
the block layer. With this patch, we can avoid a null pointer deference
as below.
[ 1124.256246] nvme nvme5: pci function 0000:e1:00.0
[ 1124.256323] nvme 0000:e1:00.0: enabling device (0000 -> 0002)
[ 1125.720859] nvme nvme5: 96/0/0 default/read/poll queues
[ 1125.732483] nvme5n1: p1 p2 p3
[ 1125.788049] BUG: unable to handle kernel NULL pointer dereference at 0000000000000130
[ 1125.788054] PGD 0 P4D 0
[ 1125.788057] Oops: 0002 [#1] SMP NOPTI
[ 1125.788059] CPU: 50 PID: 0 Comm: swapper/50 Kdump: loaded Tainted: G
------- -t - 4.18.0-147.el8.x86_64 #1
[ 1125.788065] RIP: 0010:nvme_irq+0xe8/0x240 [nvme]
[ 1125.788068] RSP: 0018:ffff916b8ec83ed0 EFLAGS: 00010813
[ 1125.788069] RAX: 0000000000000000 RBX: ffff918ae9211b00 RCX: 0000000000000000
[ 1125.788070] RDX: 000000000000400b RSI: 0000000000000000 RDI: 0000000000000000
[ 1125.788071] RBP: ffff918ae8870000 R08: 0000000000000004 R09: ffff918ae8870000
[ 1125.788072] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 1125.788073] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000001
[ 1125.788075] FS: 0000000000000000(0000) GS:ffff916b8ec80000(0000) knlGS:0000000000000000
[ 1125.788075] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1125.788076] CR2: 0000000000000130 CR3: 0000001768f00000 CR4: 0000000000340ee0
[ 1125.788077] Call Trace:
[ 1125.788080] <IRQ>
[ 1125.788085] __handle_irq_event_percpu+0x40/0x180
[ 1125.788087] handle_irq_event_percpu+0x30/0x80
[ 1125.788089] handle_irq_event+0x36/0x53
[ 1125.788090] handle_edge_irq+0x82/0x190
[ 1125.788094] handle_irq+0xbf/0x100
[ 1125.788098] do_IRQ+0x49/0xd0
[ 1125.788100] common_interrupt+0xf/0xf
Signed-off-by: Xianting Tian <tian.xianting@xxxxxxx>
---
drivers/nvme/host/pci.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 899d2f4d7..f7cf01fc7 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -940,13 +940,6 @@ static inline void nvme_handle_cqe(struct nvme_queue *nvmeq, u16 idx)
struct nvme_completion *cqe = &nvmeq->cqes[idx];
struct request *req;
- if (unlikely(cqe->command_id >= nvmeq->q_depth)) {
- dev_warn(nvmeq->dev->ctrl.device,
- "invalid id %d completed on queue %d\n",
- cqe->command_id, le16_to_cpu(cqe->sq_id));
- return;
- }
-
/*
* AEN requests are special as they don't time out and can
* survive any kind of queue freeze and often don't respond to
@@ -960,6 +953,13 @@ static inline void nvme_handle_cqe(struct nvme_queue *nvmeq, u16 idx)
}
req = blk_mq_tag_to_rq(nvme_queue_tagset(nvmeq), cqe->command_id);
+ if (unlikely(!req)) {
+ dev_warn(nvmeq->dev->ctrl.device,
+ "invalid id %d completed on queue %d\n",
+ cqe->command_id, le16_to_cpu(cqe->sq_id));
+ return;
+ }
+
trace_nvme_sq(req, cqe->sq_head, nvmeq->sq_tail);
if (!nvme_try_complete_req(req, cqe->status, cqe->result))
nvme_pci_complete_rq(req);
--
2.17.1