[PATCH v1] ufs: core: complete wl runtime resume after SCSI EH
From: Hongjie Fang
Date: Tue May 26 2026 - 07:56:39 EST
A PM START STOP sent from the UFS well-known LU resume path can race with
SCSI EH.
The "wl resume" task flow is:
  __ufshcd_wl_resume()
    ufshcd_set_dev_pwr_mode(UFS_ACTIVE_PWR_MODE)
      ufshcd_execute_start_stop()
        scsi_execute_cmd()
          blk_execute_rq              <-- wait
          scsi_check_passthrough()    <-- may retry START STOP
If the first START STOP times out, SCSI EH may already recover the link and
reset the device before scsi_execute_cmd() returns:
  scsi_timeout()
    scsi_eh_scmd_add()
  scsi_error_handler()
    scsi_unjam_host()
      scsi_eh_ready_devs()
        scsi_eh_host_reset()
          ufshcd_eh_host_reset_handler()
            if (hba->pm_op_in_progress)
              ufshcd_link_recovery()
                ufshcd_device_reset()
                ufshcd_host_reset_and_restore()
      ...
      scsi_eh_flush_done_q()    <-- wakeup "wl resume" task
      ...                       <-- host still in SHOST_RECOVERY
    scsi_restart_operations()
A later passthrough retry can then run while the host is still in
SHOST_RECOVERY and hit the SCMD_FAIL_IF_RECOVERING path:
scsi_queue_rq()
if (scsi_host_in_recovery(shost) &&
cmd->flags & SCMD_FAIL_IF_RECOVERING)
return BLK_STS_OFFLINE
That retry completes with DID_ERROR or DID_NO_CONNECT, so
__ufshcd_wl_resume() can return -EIO even though EH has already restored
the device to an operational ACTIVE state. This is especially harmful for
runtime PM because callback errors other than -EAGAIN and -EBUSY are
latched by the PM core as runtime_error.
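
For illustration only, the latching behaviour described above can be
modelled in userspace roughly as follows. This is a simplified sketch,
not the PM core code: struct rpm_model and both rpm_model_*() helpers are
hypothetical names, and the assumption that a latched error makes later
resume requests fail with -EINVAL is part of the model.

```c
#include <errno.h>

/* Hypothetical stand-in for the runtime-PM bookkeeping of one device. */
struct rpm_model {
	int runtime_error;	/* 0 means no error is latched */
};

/* Model: latch every callback error except -EAGAIN and -EBUSY. */
static int rpm_model_callback_done(struct rpm_model *dev, int retval)
{
	if (retval && retval != -EAGAIN && retval != -EBUSY)
		dev->runtime_error = retval;
	return retval;
}

/*
 * Model: while an error is latched, a resume request is refused
 * (assumed here to return -EINVAL) without running the callback.
 */
static int rpm_model_resume(struct rpm_model *dev, int callback_ret)
{
	if (dev->runtime_error)
		return -EINVAL;
	return rpm_model_callback_done(dev, callback_ret);
}
```

In this model a single spurious -EIO from the resume callback is enough to
make every later resume attempt fail, while -EBUSY leaves the device
eligible for a retry.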
After a runtime-PM START STOP returns -EIO, wait for any ongoing SCSI EH
round to finish and treat the resume as successful if recovery has
already restored:
- an online device
- UFSHCD_STATE_OPERATIONAL
- an active link
- cached ACTIVE device power mode
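
As a rough userspace model of the decision listed above (a sketch under
stated assumptions, not the driver code): struct wl_state_model and the
wl_model_*() helpers are hypothetical, and the four booleans stand in for
the state read after the EH round has finished.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical snapshot of the state inspected once EH has finished. */
struct wl_state_model {
	bool sdev_online;	/* SCSI device is online */
	bool hba_operational;	/* UFSHCD_STATE_OPERATIONAL */
	bool link_active;	/* link is active */
	bool dev_active;	/* cached device power mode is ACTIVE */
};

/* All four conditions must hold for the resume to count as recovered. */
static bool wl_model_recovered(const struct wl_state_model *s)
{
	return s->sdev_online && s->hba_operational &&
	       s->link_active && s->dev_active;
}

/* Only a runtime-PM -EIO is eligible to be turned into success. */
static int wl_model_fixup_ret(int ret, bool runtime_pm,
			      const struct wl_state_model *s)
{
	if (runtime_pm && ret == -EIO && wl_model_recovered(s))
		return 0;
	return ret;
}
```

Any other error code, a system PM resume, or a partially recovered state
leaves the original return value untouched in this model.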
Fixes: b294ff3e3449 ("scsi: ufs: core: Enable power management for wlun")
Signed-off-by: Hongjie Fang <hongjiefang@xxxxxxxxxxxx>
---
drivers/ufs/core/ufshcd.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index c3f08957d179..e7517cf23f06 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -10368,6 +10368,22 @@ static int __ufshcd_wl_suspend(struct ufs_hba *hba, enum ufs_pm_op pm_op)
 }
 #ifdef CONFIG_PM
+static int ufshcd_wl_resume_pm_recovered(struct ufs_hba *hba)
+{
+	int ret = 0;
+	struct scsi_device *sdp = hba->ufs_device_wlun;
+
+	if (!sdp || !scsi_block_when_processing_errors(sdp))
+		return 0;
+
+	if (hba->ufshcd_state == UFSHCD_STATE_OPERATIONAL &&
+	    ufshcd_is_link_active(hba) &&
+	    ufshcd_is_ufs_dev_active(hba))
+		ret = 1;
+
+	return ret;
+}
+
 static int __ufshcd_wl_resume(struct ufs_hba *hba, enum ufs_pm_op pm_op)
 {
 	int ret;
@@ -10422,6 +10438,9 @@ static int __ufshcd_wl_resume(struct ufs_hba *hba, enum ufs_pm_op pm_op)
 	if (!ufshcd_is_ufs_dev_active(hba)) {
 		ret = ufshcd_set_dev_pwr_mode(hba, UFS_ACTIVE_PWR_MODE);
+		if (pm_op == UFS_RUNTIME_PM && ret == -EIO &&
+		    ufshcd_wl_resume_pm_recovered(hba))
+			ret = 0;
 		if (ret)
 			goto set_old_link_state;
 		ufshcd_set_timestamp_attr(hba);
--
2.25.1