Re: [PATCH v1 1/2] scsi: ufs: Serialize eh_work with system PM events and async scan

From: Can Guo
Date: Tue Oct 20 2020 - 21:19:02 EST


On 2020-10-20 22:19, Bean Huo wrote:
On Mon, 2020-09-21 at 22:32 -0700, Can Guo wrote:
Serialize eh_work with system PM events and async scan to make sure
eh_work
does not run in parallel with them.

Change-Id: I33012c68e2ea443950313c59a4a46ad88cf3c82d
Signed-off-by: Can Guo <cang@xxxxxxxxxxxxxx>
---
drivers/scsi/ufs/ufshcd.c | 64 +++++++++++++++++++++++++++++------
------------
drivers/scsi/ufs/ufshcd.h | 1 +
2 files changed, 41 insertions(+), 24 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 1d8134e..7e764e8 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -5597,7 +5597,9 @@ static inline void
ufshcd_schedule_eh_work(struct ufs_hba *hba)
static void ufshcd_err_handling_prepare(struct ufs_hba *hba)
{
pm_runtime_get_sync(hba->dev);
- if (pm_runtime_suspended(hba->dev)) {
+ if (pm_runtime_status_suspended(hba->dev) || hba-
>is_sys_suspended) {

Hi Can

I curiously want to know how this can be produced in real system.

IMO, The system has been in system PM suspeded mode, how can trigger
error handler? because the tasks have been freezed, the device
interrupts disabled, before entering system PM suspended mode, the
tasks in the queue should be completed, otherwises, there is bug in the
whole flow.


thanks,
Bean

Hi Bean,

You might misunderstand here - the hba->is_sys_suspended check is
not for the case that system has entered suspend, but for the case a resume
failure happens. If system resume fails, hba->is_sys_suspended is set and
powers/clks/IRQs are OFF, so we need to prepare the environments for err_handler to run.
To reproduce this scenario, you can fake a hibern8 exit failure during system resume.

If the whole system has entered suspend, of course err_handler won't run.
I guess your real concern is that if UFS has entered system suspend (not the whole system),
but err_handling was somehow scheduled (most likely due to an non-fatal error).
This definitely is a problem and it is also why I make this change. With this change,
if UFS has entered system suspend, the eh_sem lock is held, err_handler won't even get
a chance to run, the err_handler will run only after UFS is resumed.

Thanks,

Can Guo.