[PATCH v2] scsi: ufs: Cleanup completed request without interrupt notification

From: Stanley Chu
Date: Thu Jul 02 2020 - 00:19:01 EST


If, for some reason, no interrupt is raised for a completed request
whose doorbell bit has already been cleared by the host, the UFS driver
needs to clean up its outstanding bit in ufshcd_abort().

Otherwise, the system may crash in the following abnormal flow:

After this request is requeued by the SCSI layer with its
outstanding bit still set, the next completed request will trigger
ufshcd_transfer_req_compl() to handle all "completed outstanding
bits". At this point, the "abnormal outstanding bit" will be detected
and the "requeued request" will be chosen for the request
post-processing flow. This is wrong, and blk_finish_request() will
hit BUG_ON because this request is still "alive".
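
The flow above can be illustrated with a minimal userspace sketch. It
is not driver code: struct hba_view and completed_tags() are
hypothetical names, and the bitmask comparison is only an assumed
simplification of how the completion path derives "completed
outstanding bits" from the doorbell register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, one bit per tag (not the real struct ufs_hba). */
struct hba_view {
	uint32_t outstanding_reqs;	/* tags the driver believes are in flight */
	uint32_t doorbell;		/* tags the host controller still owns */
};

/*
 * Tags the completion path would post-process: still marked
 * outstanding by the driver, but no longer set in the doorbell.
 */
static uint32_t completed_tags(const struct hba_view *hba)
{
	return hba->outstanding_reqs & ~hba->doorbell;
}
```

If tag 3 was left outstanding by an abort that skipped the cleanup and
tag 5 then completes normally, completed_tags() reports both tags, so
the requeued request at tag 3 would be post-processed as well. Clearing
bit 3 in the abort path leaves only tag 5.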

Note that before ufshcd_abort() cleans up the timed-out
request, the driver needs to check again that this request has really
not been handled by __ufshcd_transfer_req_compl() yet, because its
interrupt may arrive very late, just before the cleanup.
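
The re-check can be sketched in userspace as follows. This is an
analogue, not the patch itself: the patch tests the outstanding bit
under host_lock, whereas this sketch swaps in a C11 atomic
test-and-clear to get the same "only one path cleans up" guarantee.
claim_tag() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Atomically clear the outstanding bit for @tag.
 * Returns true if the bit was still set, meaning the caller owns the
 * cleanup; false if another path (a late completion) already cleared it.
 */
static bool claim_tag(_Atomic uint32_t *outstanding, unsigned int tag)
{
	uint32_t mask, old;

	assert(tag < 32);
	mask = UINT32_C(1) << tag;
	old = atomic_fetch_and(outstanding, ~mask);
	return (old & mask) != 0;
}
```

An abort path built this way would unmap DMA and reset its per-tag
state only when claim_tag() returns true, and would otherwise leave
the request alone because the completion path has already handled it.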

Signed-off-by: Stanley Chu <stanley.chu@xxxxxxxxxxxx>
---
drivers/scsi/ufs/ufshcd.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index cadfa9006972..0f4f3255e403 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6462,7 +6462,7 @@ static int ufshcd_abort(struct scsi_cmnd *cmd)
/* command completed already */
dev_err(hba->dev, "%s: cmd at tag %d successfully cleared from DB.\n",
__func__, tag);
- goto out;
+ goto cleanup;
} else {
dev_err(hba->dev,
"%s: no response from device. tag = %d, err %d\n",
@@ -6496,9 +6496,14 @@ static int ufshcd_abort(struct scsi_cmnd *cmd)
goto out;
}

+cleanup:
+ spin_lock_irqsave(host->host_lock, flags);
+ if (!test_bit(tag, hba->outstanding_reqs)) {
+ spin_unlock_irqrestore(host->host_lock, flags);
+ goto out;
+ }
scsi_dma_unmap(cmd);

- spin_lock_irqsave(host->host_lock, flags);
ufshcd_outstanding_req_clear(hba, tag);
hba->lrb[tag].cmd = NULL;
spin_unlock_irqrestore(host->host_lock, flags);
--
2.18.0