[PATCH 4.14 25/59] scsi: core: Synchronize request queue PM status only on successful resume

From: Greg Kroah-Hartman
Date: Mon Jan 21 2019 - 09:14:25 EST


4.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Stanley Chu <stanley.chu@xxxxxxxxxxxx>

commit 3f7e62bba0003f9c68f599f5997c4647ef5b4f4e upstream.

Commit 356fd2663cff ("scsi: Set request queue runtime PM status back to
active on resume") fixed the inconsistent RPM status between the request
queue and the device. However, the request queue RPM status should be
changed only on successful resume; otherwise the status may still be
inconsistent, as below:

Request queue: RPM_ACTIVE
Device: RPM_SUSPENDED

This ends up in a soft lockup because requests can be submitted to the
underlying devices while those devices and their required resources are
not resumed.

For example,

Once the above inconsistent status occurs, an IO request can be submitted
to the UFS device driver while a required resource (such as a clock) is not
yet resumed, which leads to a warning with the call stack below:

WARN_ON(hba->clk_gating.state != CLKS_ON);
ufshcd_queuecommand
scsi_dispatch_cmd
scsi_request_fn
__blk_run_queue
cfq_insert_request
__elv_add_request
blk_flush_plug_list
blk_finish_plug
jbd2_journal_commit_transaction
kjournald2

All subsequent IO requests may then hang because the storage host or
device does not respond, and a soft lockup follows. In the end, the system
may crash in many ways.
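
The ordering problem described above can be illustrated with a small
userspace model. This is only a sketch, not kernel code: struct model_dev,
resume_old(), resume_new(), is_consistent() and run_model_checks() are
hypothetical names, and cb_err stands in for the return value of the
driver's resume callback.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's runtime PM states. */
enum rpm_status { RPM_ACTIVE, RPM_SUSPENDED };

/* Models one SCSI device together with its request queue. */
struct model_dev {
	enum rpm_status dev_status;	/* RPM status of the device */
	enum rpm_status queue_status;	/* RPM status of the request queue */
};

static bool is_consistent(const struct model_dev *d)
{
	return d->dev_status == d->queue_status;
}

/*
 * Old ordering: the queue is forced to "active" before the resume
 * callback has run, so a failing callback leaves the queue active
 * while the device stays suspended.
 */
static int resume_old(struct model_dev *d, int cb_err)
{
	if (d->dev_status == RPM_SUSPENDED)
		d->queue_status = RPM_ACTIVE;
	if (cb_err == 0)
		d->dev_status = RPM_ACTIVE;
	return cb_err;
}

/*
 * New ordering: the queue status is synchronized only after the
 * resume callback has succeeded.
 */
static int resume_new(struct model_dev *d, int cb_err)
{
	if (cb_err == 0) {
		d->dev_status = RPM_ACTIVE;
		d->queue_status = RPM_ACTIVE;
	}
	return cb_err;
}

/* Exercises both orderings with a failing resume callback. */
static void run_model_checks(void)
{
	struct model_dev a = { RPM_SUSPENDED, RPM_SUSPENDED };
	struct model_dev b = { RPM_SUSPENDED, RPM_SUSPENDED };

	resume_old(&a, -1);
	assert(!is_consistent(&a));	/* queue active, device suspended */

	resume_new(&b, -1);
	assert(is_consistent(&b));	/* both remain suspended */
}
```

Calling run_model_checks() from a test program exercises the failing-resume
case for both orderings. The model leaves out the additional check of the
pm_runtime_set_active() return value that the patch also makes.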

Fixes: 356fd2663cff (scsi: Set request queue runtime PM status back to active on resume)
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Stanley Chu <stanley.chu@xxxxxxxxxxxx>
Reviewed-by: Bart Van Assche <bvanassche@xxxxxxx>
Signed-off-by: Martin K. Petersen <martin.petersen@xxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

---
drivers/scsi/scsi_pm.c | 26 +++++++++++++++-----------
1 file changed, 15 insertions(+), 11 deletions(-)

--- a/drivers/scsi/scsi_pm.c
+++ b/drivers/scsi/scsi_pm.c
@@ -79,8 +79,22 @@ static int scsi_dev_type_resume(struct d

 	if (err == 0) {
 		pm_runtime_disable(dev);
-		pm_runtime_set_active(dev);
+		err = pm_runtime_set_active(dev);
 		pm_runtime_enable(dev);
+
+		/*
+		 * Forcibly set runtime PM status of request queue to "active"
+		 * to make sure we can again get requests from the queue
+		 * (see also blk_pm_peek_request()).
+		 *
+		 * The resume hook will correct runtime PM status of the disk.
+		 */
+		if (!err && scsi_is_sdev_device(dev)) {
+			struct scsi_device *sdev = to_scsi_device(dev);
+
+			if (sdev->request_queue->dev)
+				blk_set_runtime_active(sdev->request_queue);
+		}
 	}

 	return err;
@@ -139,16 +153,6 @@ static int scsi_bus_resume_common(struct
 	else
 		fn = NULL;

-	/*
-	 * Forcibly set runtime PM status of request queue to "active" to
-	 * make sure we can again get requests from the queue (see also
-	 * blk_pm_peek_request()).
-	 *
-	 * The resume hook will correct runtime PM status of the disk.
-	 */
-	if (scsi_is_sdev_device(dev) && pm_runtime_suspended(dev))
-		blk_set_runtime_active(to_scsi_device(dev)->request_queue);
-
 	if (fn) {
 		async_schedule_domain(fn, dev, &scsi_sd_pm_domain);