Re: [PATCH RFC 2/2] scsi: ufshcd: Fix device links when BOOT WLUN fails to probe

From: Adrian Hunter
Date: Thu Jul 08 2021 - 12:02:49 EST


On 8/07/21 6:12 pm, Rafael J. Wysocki wrote:
> On Thu, Jul 8, 2021 at 5:03 PM Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
>>
>> On Thu, Jul 8, 2021 at 4:17 PM Adrian Hunter <adrian.hunter@xxxxxxxxx> wrote:
>>>
>>> On 8/07/21 3:31 pm, Rafael J. Wysocki wrote:
>>>> On Wed, Jul 7, 2021 at 7:49 PM Adrian Hunter <adrian.hunter@xxxxxxxxx> wrote:
>>>>>
>>>>> On 7/07/21 8:39 pm, Greg Kroah-Hartman wrote:
>>>>>> On Wed, Jul 07, 2021 at 08:29:48PM +0300, Adrian Hunter wrote:
>>>>>>> If a LUN fails to probe (e.g. absent BOOT WLUN), the device will not have
>>>>>>> been registered but can still have a device link holding a reference to the
>>>>>>> device. The unwanted device link will prevent runtime suspend indefinitely,
>>>>>>> and cause some warnings if the supplier is ever deleted (e.g. by unbinding
>>>>>>> the UFS host controller). Fix by explicitly deleting the device link when
>>>>>>> SCSI destroys the SCSI device.
>>>>>>>
>>>>>>> Signed-off-by: Adrian Hunter <adrian.hunter@xxxxxxxxx>
>>>>>>> ---
>>>>>>> drivers/scsi/ufs/ufshcd.c | 7 +++++++
>>>>>>> 1 file changed, 7 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
>>>>>>> index 708b3b62fc4d..483aa74fe2c8 100644
>>>>>>> --- a/drivers/scsi/ufs/ufshcd.c
>>>>>>> +++ b/drivers/scsi/ufs/ufshcd.c
>>>>>>> @@ -5029,6 +5029,13 @@ static void ufshcd_slave_destroy(struct scsi_device *sdev)
>>>>>>>  		spin_lock_irqsave(hba->host->host_lock, flags);
>>>>>>>  		hba->sdev_ufs_device = NULL;
>>>>>>>  		spin_unlock_irqrestore(hba->host->host_lock, flags);
>>>>>>> +	} else {
>>>>>>> +		/*
>>>>>>> +		 * If a LUN fails to probe (e.g. absent BOOT WLUN), the device
>>>>>>> +		 * will not have been registered but can still have a device
>>>>>>> +		 * link holding a reference to the device.
>>>>>>> +		 */
>>>>>>> +		device_links_scrap(&sdev->sdev_gendev);
>>>>>>
>>>>>> What created that link? And why did it do that before probe happened
>>>>>> successfully?
>>>>>
>>>>> The same driver created the link.
>>>>>
>>>>> The documentation seems to say it is allowed to, if it is the consumer.
>>>>> From Documentation/driver-api/device_link.rst
>>>>>
>>>>> Usage
>>>>> =====
>>>>>
>>>>> The earliest point in time when device links can be added is after
>>>>> :c:func:`device_add()` has been called for the supplier and
>>>>> :c:func:`device_initialize()` has been called for the consumer.
>>>>
>>>> Yes, this is allowed, but if you've added device links to a device
>>>> object that is not going to be registered after all, you are
>>>> responsible for doing the cleanup.
>>>>
>>>> Why can't you call device_link_del() directly on those links?
>>>>
>>>> Or device_link_remove() if you don't want to deal with link pointers?
>>>>
>>>
>>> Those only work for DL_FLAG_STATELESS device links, but we use only
>>> DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE flags.
>>
>> So I'd probably modify device_link_remove() to check if the consumer
>> device has been registered and run __device_link_del() directly
>> instead of device_link_put_kref() if it hasn't.
>>
>> Or add an argument to it to force the removal.
>
> Or even modify device_link_put_kref() like this:
>
> static void device_link_put_kref(struct device_link *link)
> {
>  	if (link->flags & DL_FLAG_STATELESS)
>  		kref_put(&link->kref, __device_link_del);
> +	else if (!device_is_registered(link->consumer))
> +		__device_link_del(link);
>  	else
>  		WARN(1, "Unable to drop a managed device link reference\n");
> }
>

Thanks! :-) I will do that.
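
For anyone following along, below is a minimal user-space sketch of the
decision that the modified device_link_put_kref() above would make. It is
only a model and not kernel code: the model_* types and helpers, the
MODEL_FLAG_* values and the put_result enum are all made up for
illustration. The real function works on struct device_link, uses
kref_put() and device_is_registered(), and emits a WARN() rather than
returning a status.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the DL_FLAG_* bits; values are arbitrary. */
#define MODEL_FLAG_STATELESS	(1u << 0)
#define MODEL_FLAG_PM_RUNTIME	(1u << 1)
#define MODEL_FLAG_RPM_ACTIVE	(1u << 2)

/* What the model did with the link, so callers can inspect the branch. */
enum put_result {
	PUT_KREF_DROPPED,	/* stateless link: one reference dropped */
	PUT_LINK_DELETED,	/* managed link, consumer never registered */
	PUT_WARNED,		/* managed link, consumer registered: refuse */
};

/* Hypothetical stand-in for struct device. */
struct model_device {
	bool registered;
};

/* Hypothetical stand-in for struct device_link. */
struct model_link {
	unsigned int flags;
	int refcount;
	bool deleted;
	struct model_device *consumer;
};

/* Models __device_link_del(): here it only marks the link as gone. */
static void model_link_del(struct model_link *link)
{
	link->deleted = true;
}

/* Models the proposed three-way branch in device_link_put_kref(). */
static enum put_result model_link_put(struct model_link *link)
{
	if (link->flags & MODEL_FLAG_STATELESS) {
		/* Like kref_put(): delete only when the last ref goes. */
		if (--link->refcount == 0)
			model_link_del(link);
		return PUT_KREF_DROPPED;
	}

	if (!link->consumer->registered) {
		/* Consumer was never added, so nothing else will clean up. */
		model_link_del(link);
		return PUT_LINK_DELETED;
	}

	/* The kernel version would WARN() here and leave the link alone. */
	return PUT_WARNED;
}

/* Walks the case from this thread: a managed link on an unprobed LUN. */
static void model_demo(void)
{
	struct model_device lun = { .registered = false };
	struct model_link link = {
		.flags = MODEL_FLAG_PM_RUNTIME | MODEL_FLAG_RPM_ACTIVE,
		.refcount = 1,
		.deleted = false,
		.consumer = &lun,
	};

	assert(model_link_put(&link) == PUT_LINK_DELETED);
	assert(link.deleted);
}
```

In this model, a link with only the PM_RUNTIME | RPM_ACTIVE style flags on
a consumer that never got registered is deleted outright, which is the
failed BOOT WLUN probe case. The same link on a registered consumer is
left in place.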