[PATCH 4.14 10/41] scsi: ibmvscsi: Fix empty event pool access during host removal

From: Greg Kroah-Hartman
Date: Tue Mar 26 2019 - 02:46:39 EST


4.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Tyrel Datwyler <tyreld@xxxxxxxxxxxxxxxxxx>

commit 7f5203c13ba8a7b7f9f6ecfe5a4d5567188d7835 upstream.

The event pool used for queueing commands is destroyed fairly early in the
ibmvscsi_remove() code path. Since, this happens prior to the call so
scsi_remove_host() it is possible for further calls to queuecommand to be
processed which manifest as a panic due to a NULL pointer dereference as
seen here:

PANIC: "Unable to handle kernel paging request for data at address
0x00000000"

Context process backtrace:

DSISR: 0000000042000000 ????Syscall Result: 0000000000000000
4 [c000000002cb3820] memcpy_power7 at c000000000064204
[Link Register] [c000000002cb3820] ibmvscsi_send_srp_event at d000000003ed14a4
5 [c000000002cb3920] ibmvscsi_send_srp_event at d000000003ed14a4 [ibmvscsi] ?(unreliable)
6 [c000000002cb39c0] ibmvscsi_queuecommand at d000000003ed2388 [ibmvscsi]
7 [c000000002cb3a70] scsi_dispatch_cmd at d00000000395c2d8 [scsi_mod]
8 [c000000002cb3af0] scsi_request_fn at d00000000395ef88 [scsi_mod]
9 [c000000002cb3be0] __blk_run_queue at c000000000429860
10 [c000000002cb3c10] blk_delay_work at c00000000042a0ec
11 [c000000002cb3c40] process_one_work at c0000000000dac30
12 [c000000002cb3cd0] worker_thread at c0000000000db110
13 [c000000002cb3d80] kthread at c0000000000e3378
14 [c000000002cb3e30] ret_from_kernel_thread at c00000000000982c

The kernel buffer log is overfilled with this log:

[11261.952732] ibmvscsi: found no event struct in pool!

This patch reorders the operations during host teardown. Start by calling
the SRP transport and Scsi_Host remove functions to flush any outstanding
work and set the host offline. LLDD teardown follows including destruction
of the event pool, freeing the Command Response Queue (CRQ), and unmapping
any persistent buffers. The event pool destruction is protected by the
scsi_host lock, and the pool is purged prior of any requests for which we
never received a response. Finally, move the removal of the scsi host from
our global list to the end so that the host is easily locatable for
debugging purposes during teardown.

Cc: <stable@xxxxxxxxxxxxxxx> # v2.6.12+
Signed-off-by: Tyrel Datwyler <tyreld@xxxxxxxxxxxxxxxxxx>
Signed-off-by: Martin K. Petersen <martin.petersen@xxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

---
drivers/scsi/ibmvscsi/ibmvscsi.c | 22 ++++++++++++++++------
1 file changed, 16 insertions(+), 6 deletions(-)

--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -2299,17 +2299,27 @@ static int ibmvscsi_probe(struct vio_dev
static int ibmvscsi_remove(struct vio_dev *vdev)
{
struct ibmvscsi_host_data *hostdata = dev_get_drvdata(&vdev->dev);
- spin_lock(&ibmvscsi_driver_lock);
- list_del(&hostdata->host_list);
- spin_unlock(&ibmvscsi_driver_lock);
- unmap_persist_bufs(hostdata);
+ unsigned long flags;
+
+ srp_remove_host(hostdata->host);
+ scsi_remove_host(hostdata->host);
+
+ purge_requests(hostdata, DID_ERROR);
+
+ spin_lock_irqsave(hostdata->host->host_lock, flags);
release_event_pool(&hostdata->pool, hostdata);
+ spin_unlock_irqrestore(hostdata->host->host_lock, flags);
+
ibmvscsi_release_crq_queue(&hostdata->queue, hostdata,
max_events);

kthread_stop(hostdata->work_thread);
- srp_remove_host(hostdata->host);
- scsi_remove_host(hostdata->host);
+ unmap_persist_bufs(hostdata);
+
+ spin_lock(&ibmvscsi_driver_lock);
+ list_del(&hostdata->host_list);
+ spin_unlock(&ibmvscsi_driver_lock);
+
scsi_host_put(hostdata->host);

return 0;