Re: [PATCH V3] scsi: storvsc: Allow only one remove lun work item to be issued per lun
From: Martin K. Petersen
Date: Fri Nov 03 2017 - 12:29:19 EST
Cathy,
> When running multipath on a VM, if all available paths go down, the
> driver can schedule a large number of storvsc_remove_lun work items
> for the same lun. In response to the failing paths, storvsc typically
> takes host->scan_mutex and issues a TUR per lun. If there has been
> heavy IO to the failed device, all the failed IOs are returned from
> the host, and a remove lun work item is issued per failed IO. If the
> outstanding TURs are not completed in a timely manner, the scan_mutex
> is never released or is released too late. Consequently, the many
> remove lun work items cannot complete, because scsi_remove_device
> also tries to take host->scan_mutex. This drags the VM down,
> sometimes completely.
>
> This patch allows only one remove lun work item to be issued to a
> particular lun while it is an instantiated member of the scsi stack.
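
For readers following the thread, the idea in the quoted description
(one remove lun work item per lun, handled in order) can be sketched as
a toy userspace model. This is only an illustration under stated
assumptions, not the code in the patch: the names RemoveLunQueue,
on_failed_io, run_one and simulate are hypothetical, and a plain FIFO
stands in for an ordered workqueue.

```python
from collections import deque


class RemoveLunQueue:
    """Toy model (hypothetical, not the storvsc code): an ordered work
    queue that holds at most one pending removal per LUN."""

    def __init__(self):
        self._queue = deque()   # work items run strictly in FIFO order
        self._pending = set()   # LUNs that already have a removal queued
        self.removed = []       # LUNs removed so far, in execution order

    def on_failed_io(self, lun):
        """Called once per failed IO. Returns True only if a new work
        item was queued; duplicates for the same LUN are dropped."""
        if lun in self._pending:
            return False
        self._pending.add(lun)
        self._queue.append(lun)
        return True

    def run_one(self):
        """Run the oldest queued removal. Returns the LUN, or None if
        the queue is empty."""
        if not self._queue:
            return None
        lun = self._queue.popleft()
        self._pending.discard(lun)
        self.removed.append(lun)
        return lun


def simulate(failed_ios):
    """Feed a burst of failed IOs, then drain the queue.
    Returns (number of work items actually queued, removal order)."""
    q = RemoveLunQueue()
    queued = sum(q.on_failed_io(lun) for lun in failed_ios)
    while q.run_one() is not None:
        pass
    return queued, q.removed
```

In this model, a burst of 1000 failed IOs on one lun queues a single
removal rather than 1000, so there is never a pile of work items all
waiting on the same lock.
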
Applied to 4.15/scsi-queue.
Next time the change log needs to go after a "---" delimiter.
Thank you!
> Changes since v1:
> Use single threaded workqueue to serialize work in
> storvsc_handle_error [Christoph Hellwig]
>
> Changes since v2:
> Replaced create_singlethread_workqueue with
> alloc_ordered_workqueue [Christoph Hellwig]
>
> Added Reviewed-by tags.
--
Martin K. Petersen Oracle Linux Engineering