Re: [RFC PATCH] scsi: libsas: fix WARN on device removal

From: wangyijing
Date: Tue Nov 22 2016 - 20:11:33 EST


>>
>> The events are not lost.
>
> In sas_queue_event(), if a particular event is already pending for a port/PHY, we cannot queue further events of the same type for that port/PHY. I think my colleagues found an issue where we try to enqueue multiple complementary events.

Yes, we found this issue in our local tests.
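
To make the behaviour described above concrete, here is a minimal userspace sketch of a "one pending bit per event type" queue. It is not the libsas code: every name in it (demo_phy, demo_queue_event, demo_event_done, the DEMO_EV_* values) is hypothetical, and it only assumes that a second event of the same type is dropped while the first is still pending.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical event numbers, for illustration only. */
enum demo_event {
	DEMO_EV_LINK_UP = 0,
	DEMO_EV_LINK_DOWN = 1,
	DEMO_EV_MAX
};

struct demo_phy {
	unsigned long pending;	/* one bit per event type */
	int queued;		/* number of work items actually queued */
};

/*
 * Queue an event unless one of the same type is already pending.
 * Returns true if the event was queued, false if it was dropped.
 */
static bool demo_queue_event(struct demo_phy *phy, enum demo_event ev)
{
	unsigned long bit;

	assert(ev < DEMO_EV_MAX);
	bit = 1UL << ev;

	if (phy->pending & bit)
		return false;	/* same event type still pending: dropped */

	phy->pending |= bit;
	phy->queued++;
	return true;
}

/* Clear the pending bit so the same event type can be queued again. */
static void demo_event_done(struct demo_phy *phy, enum demo_event ev)
{
	assert(ev < DEMO_EV_MAX);
	phy->pending &= ~(1UL << ev);
}
```

In this sketch, queuing DEMO_EV_LINK_DOWN twice in a row without an intervening demo_event_done() queues only one work item, while a complementary DEMO_EV_LINK_UP in between is accepted because it uses a different bit.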

>
>> The new problem this patch introduces is
>> delaying sas port deletion where it was previously immediate. So now
>> we can get into a situation where the port has gone down and can start
>> processing a port up event before the previous deletion work has run.
>>
>>>>
>>>>> And it's a very noisy warning, as in 6K lines on the console when an
>>>>> expander is unplugged.
>>>>
>>>>
>>>> Does something like this modulate the failure?
>>
>> I'm curious if we simply need to fix the double deletion of the
>> sas_port bsg queue, could you try the changes below?
>>
>
> No, I just tested it on a root port and we get the same WARN.
>
>>>>
>>>> diff --git a/drivers/scsi/scsi_transport_sas.c b/drivers/scsi/scsi_transport_sas.c
>>>> index 60b651bfaa01..11401e5c88ba 100644
>>>> --- a/drivers/scsi/scsi_transport_sas.c
>>>> +++ b/drivers/scsi/scsi_transport_sas.c
>>>> @@ -262,9 +262,10 @@ static void sas_bsg_remove(struct Scsi_Host *shost, struct sas_rphy *rphy
>>>>  {
>>>>  	struct request_queue *q;
>>>>
>>>> -	if (rphy)
>>>> +	if (rphy) {
>>>>  		q = rphy->q;
>>>> -	else
>>>> +		rphy->q = NULL;
>>>> +	} else
>>>>  		q = to_sas_host_attrs(shost)->q;
>>>>
>>>>  	if (!q)
>>>>
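
For reference, the intent of the quoted diff (clear the stored queue pointer so that a second removal becomes a no-op) can be sketched in userspace as below. This is only an illustration under assumptions: demo_rphy, demo_queue, demo_cleanup_queue and demo_bsg_remove are hypothetical stand-ins, not the scsi_transport_sas code, and as noted above the change did not remove the WARN in testing.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the request queue and the rphy. */
struct demo_queue {
	int id;
};

struct demo_rphy {
	struct demo_queue *q;
};

/* Counts how often a queue is really torn down. */
static int demo_cleanup_calls;

static void demo_cleanup_queue(struct demo_queue *q)
{
	demo_cleanup_calls++;
	free(q);
}

/*
 * Take the queue pointer and clear it in the owner before tearing
 * the queue down, so a second call finds NULL and returns early.
 */
static void demo_bsg_remove(struct demo_rphy *rphy)
{
	struct demo_queue *q = rphy->q;

	rphy->q = NULL;
	if (!q)
		return;

	demo_cleanup_queue(q);
}
```

Calling demo_bsg_remove() twice on the same demo_rphy tears the queue down once; the second call sees a NULL pointer and does nothing. The sketch is single-threaded and does not address ordering between concurrent removal paths.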