Re: [PATCH v10 1/2] scsi: ufs: Enable power management for wlun

From: Asutosh Das (asd)
Date: Wed Mar 10 2021 - 11:40:58 EST


On 3/10/2021 8:27 AM, Alan Stern wrote:
On Tue, Mar 09, 2021 at 08:04:53PM -0800, Asutosh Das (asd) wrote:
On 3/9/2021 7:14 PM, Alan Stern wrote:
On Tue, Mar 09, 2021 at 07:04:34PM -0800, Asutosh Das (asd) wrote:
Hello
I & Can (thanks CanG) debugged this further:

Looks like this issue can occur if the sd probe is asynchronous.

Essentially, the sd_probe() is done asynchronously and driver_probe_device()
invokes pm_runtime_get_suppliers() before invoking sd_probe().

But scsi_probe_and_add_lun() runs in a separate context.
So the scsi_autopm_put_device() invoked from scsi_scan_host() context
reduces the link->rpm_active to 1. And sd_probe() invokes
scsi_autopm_put_device() and starts a timer. And then driver_probe_device()
invoked from __device_attach_async_helper context reduces the
link->rpm_active to 1 thus enabling the supplier to suspend before the
consumer suspends.

I don't see a way around this. Please let me know if you
(@Alan/@Bart/@Adrian) have any thoughts on this.

How about changing the SCSI core so that it does a runtime_get before
starting an async probe, and the async probe routine does a
runtime_put when it is finished? In other words, don't allow a device
to go into runtime suspend while it is waiting to be probed.

I don't think that would be too intrusive.

Alan Stern


Hi Alan
Thanks for the suggestion.

Am trying to understand:

Do you mean something like this:

int scsi_sysfs_add_sdev(struct scsi_device *sdev)
{

scsi_autopm_get_device(sdev);
pm_runtime_get_noresume(&sdev->sdev_gendev);
[...]
scsi_autopm_put_device(sdev);
[...]
}

static int sd_probe(struct device *dev)
{
[...]
pm_runtime_put_noidle(dev);
scsi_autopm_put_device(sdp);
[...]
}

This may work (I'm limited by my imagination in scsi layer :) ).

I'm not sure about this. To be honest, I did not read the entirety of
your last message; it had way too much detail. THere's a time and place
for that, but when you're brainstorming to figure out the underlying
cause of a problem and come up with a strategy to fix it, you want to
concentrate on the overall picture, not the details.

As I understand the situation, you've get a SCSI target with multiple
logical units, let's say A and B, and you need to make sure that A never
goes into runtime suspend unless B is already suspended. In other
words, B always has to suspend before A and resume after A.

To do this, you register a device link with A as the supplier and B as
the consumer. Then the PM core takes care of the ordering for you.

But I don't understand when you set up the device link. If the timing
is wrong then, thanks to async SCSI probing, you may have a situation
where A is registered before B and before the link is set up. Then
there's temporarily nothing to stop A from suspending before B.

You also need to prevent each device from suspending before it is
probed. That's the easy part I was trying to address before (although
it may not be so easy if the drivers are in loadable modules and not
present in the kernel).

You need to think through these issues before proposing actual changes.

But the pm_runtime_put_noidle() would have to be added to all registered
scsi_driver{}, perhaps? Or may be I can check for sdp->type?

Like this; it's too early to worry about this sort of thing.

Alan Stern

Hi Alan
Thanks. Understood.

I will check the details and see if I can come up with something.
I'll propose an alternate fix otherwise and drop this change altogether.

Thanks!
-asd

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
Linux Foundation Collaborative Project