Re: [PATCH v4 06/10] scsi: ufs: Remove host_sem used in suspend/resume

From: Bart Van Assche
Date: Tue Jun 29 2021 - 14:01:59 EST


On 6/28/21 11:23 PM, Can Guo wrote:
> On 2021-06-29 01:31, Bart Van Assche wrote:
>> On 6/28/21 1:17 AM, Can Guo wrote:
>>> On 2021-06-25 01:11, Bart Van Assche wrote:
>>>> On 6/23/21 11:31 PM, Can Guo wrote:
>>>>> Using back host_sem in suspend_prepare()/resume_complete()
>>>>> won't have this problem of deadlock, right?
>>>>
>>>> Although that would solve the deadlock discussed in this email
>>>> thread, it wouldn't solve the issue of potential adverse
>>>> interactions of the UFS error handler and the SCSI error
>>>> handler running concurrently.
>>>
>>> I think I've explained it before, paste it here -
>>>
>>> ufshcd_eh_host_reset_handler() invokes ufshcd_err_handler() and
>>> flushes it, so SCSI error handler and UFS error handler can
>>> safely run together.
>>
>> That code path is the exception. Do you agree that the following
>> three functions all invoke the ufshcd_err_handler() function
>> asynchronously? * ufshcd_uic_pwr_ctrl() * ufshcd_check_errors() *
>> ufshcd_abort()
>
> I agree, but I don't see what's wrong with that. Any context can
> invoke ufs error handler asynchronously and ufs error handler prepare
> makes sure error handler can work safely, i.e., stopping PM
> ops/gating/scaling in error handler prepare makes sure no one shall
> call ufshcd_uic_pwr_ctrl() ever again. And ufshcd_check_errors() and
> ufshcd_abort() are OK to run concurrently with UFS error handler.

The current UFS error handling approach requires the following code in
ufshcd_queuecommand():

if (hba->pm_op_in_progress) {
hba->force_reset = true;
set_host_byte(cmd, DID_BAD_TARGET);
cmd->scsi_done(cmd);
goto out;
}

Removing that code is not possible with the current error handling
approach. My patch makes it possible to remove that code.

> Sorry that I missed the change of scsi_transport_template() in your
> previous message. I can understand that you want to invoke UFS error
> hander by invoking SCSI error handler, but I didn't go that far
> because I saw you changed pm_runtime_get_sync() to
> pm_runtime_get_noresume() in ufs error handler prepare. How can that
> change make sure that the device is not suspending or resuming while
> error handler is running?

UFS power state transitions happen by submitting a SCSI command to a
WLUN. The SCSI error handler is only activated after all outstanding
SCSI commands for a SCSI host have failed or completed. I think this
guarantees for the UFS driver that eh_strategy_handler is not invoked
while a command submitted to a WLUN is changing the power state of the
UFS device. The following code from scsi_error.c only wakes up the error
handler if (shost->host_failed || shost->host_eh_scheduled) &&
shost->host_failed == scsi_host_busy(shost):

if ((shost->host_failed == 0 && shost->host_eh_scheduled == 0)
|| shost->host_failed != scsi_host_busy(shost)) {
schedule();
continue;
}
/* Handle SCSI errors */

Thanks,

Bart.