RE: PATCH 1/5] scsi: megaraid_sas - Add Online Controller Reset toMegaRAID SAS drive

From: Yang, Bo
Date: Mon Oct 11 2010 - 11:37:48 EST


Tomas,

The reason the driver does flush_scheduled_work() is driver did cancel_delayed_work(). I am not sure driver need to call flush_scheduled_work() if the scheduled work already done, but I will test this changes and submit another patch as soon as it works fine.

This change should not be the part of this patch, we would have the new patch submit after the verification.

Regards,

Bo Yang

-----Original Message-----
From: Tomas Henzl [mailto:thenzl@xxxxxxxxxx]
Sent: Monday, October 11, 2010 9:21 AM
To: Yang, Bo
Cc: James Bottomley; bo yang; linux-scsi@xxxxxxxxxxxxxxx; akpm@xxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
Subject: Re: PATCH 1/5] scsi: megaraid_sas - Add Online Controller Reset to MegaRAID SAS drive

On 10/11/2010 02:55 PM, Yang, Bo wrote:
> Tomas,
>
>
>> Another correction - flush_scheduled_work is already present in megass_detach_one
>> it only should be moved away from the if statement.
>>
> The flush_scheduled_work is for schedule_delayed_work(). We need to flush it and remove.
>
I'm not saying it should be removed, I think a move outside from the 'if' block
makes it work for ev->hotplug_work and for the newly added instance->work_init


diff --git a/drivers/scsi/megaraid/megaraid_sas.c b/drivers/scsi/megaraid/megaraid_sas.c
index 55951f4..7773707 100644
--- a/drivers/scsi/megaraid/megaraid_sas.c
+++ b/drivers/scsi/megaraid/megaraid_sas.c
@@ -4088,9 +4088,9 @@ static void __devexit megasas_detach_one(struct pci_dev *pdev)
struct megasas_aen_event *ev = instance->ev;
cancel_delayed_work(
(struct delayed_work *)&ev->hotplug_work);
- flush_scheduled_work();
- instance->ev = NULL;
}
+ flush_scheduled_work();
+ instance->ev = NULL;

tasklet_kill(&instance->isr_tasklet);

--

> -----Original Message-----
> From: Tomas Henzl [mailto:thenzl@xxxxxxxxxx]
> Sent: Saturday, October 09, 2010 4:38 PM
> To: James Bottomley
> Cc: bo yang; linux-scsi@xxxxxxxxxxxxxxx; akpm@xxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Yang, Bo
> Subject: Re: PATCH 1/5] scsi: megaraid_sas - Add Online Controller Reset to MegaRAID SAS drive
>
> On 10/08/2010 09:28 PM, Tomas Henzl wrote:
>
>> On 10/08/2010 06:39 PM, James Bottomley wrote:
>>
>>
>>> On Fri, 2010-10-08 at 17:51 +0200, Tomas Henzl wrote:
>>>
>>>
>>>
>>>> On 09/23/2010 04:36 AM, bo yang wrote:
>>>>
>>>>
>>>>
>>>>> This patch is too big. I am using attachment to submit. Please
>>>>> use attached file to apply. Also let me know if it can't be accepted.
>>>>>
>>>>> To add the Online controller reset support, driver need to do:
>>>>> a). reset the controller chips -- Xscale and Gen2 which will change
>>>>> the function calls and add the reset function related to this two
>>>>> chips.
>>>>> b). during the reset, driver will store the pending cmds which not
>>>>> returned by FW to driver's pending queue. Driver will re-issue those
>>>>> pending cmds again to FW after the OCR finished.
>>>>> c). In driver's timeout routine, driver will report to OS as reset.
>>>>> Also driver's queue routine will block the cmds until the OCR
>>>>> finished.
>>>>> d). in Driver's ISR routine, if driver get the FW state as state
>>>>> change, FW in Failure status and FW support online controller
>>>>> reset (OCR), driver will start to do the controller reset.
>>>>> e). In driver's IOCTL routine, the application cmds will wait for the
>>>>> OCR to finish, then issue the cmds to FW.
>>>>>
>>>>> Signed-off-by Bo Yang<bo.yang@xxxxxxx>
>>>>>
>>>>> ---
>>>>> drivers/scsi/megaraid/megaraid_sas.c | 756 ++++++++++++++++++++++++++++++++---
>>>>> drivers/scsi/megaraid/megaraid_sas.h | 88 +++-
>>>>> 2 files changed, 787 insertions(+), 57 deletions(-)
>>>>>
>>>>>
>>>>>
>>>> Hi Bo,
>>>> in the workqueue function you sleep for 30s,
>>>> it's scheduled here - schedule_work(&instance->work_init);
>>>>
>>>> +process_fw_state_change_wq(struct work_struct *work)
>>>> +{
>>>> ...
>>>> + /*waitting for about 20 second before start the second init*/
>>>> + for (wait = 0; wait < 30; wait++) {
>>>> + msleep(1000);
>>>> + }
>>>>
>>>>
>>>>
>>> this lot should be ssleep(20) if you want a 20 sec sleep.
>>>
>>>
>>>
>> please do that on every place where you use the
>> "for (wait = 0; wait < n; wait++) msleep(1000);" construction
>>
>>
>>
>>>> - this is not a good practice to sleep for a so long time I think
>>>>
>>>>
>>>>
>> this long sleep might might be ok, if the workqueue is used only rarely
>> is it so?
>>
>>
>>
>>>> - you should use in your exit function some synchronization
>>>> for example 'cancel_work_sync', without that if someone rmmods your
>>>> module, it could then lead to a memory corruption
>>>>
>>>>
>>>>
>>> Actually flush_scheduled_work() should be fine ... it will force the
>>> module removal to wait for completion ... cancellation can be error
>>> prone, so just forcing the wait sounds easier.
>>>
>>>
>>>
> Another correction - flush_scheduled_work is already present in megass_detach_one
> it only should be moved away from the if statement.
>
>
>
>> someone told that cancel_work_sync is safer then flush_scheduled_work
>> but I'm not an expert, so ok
>>
>> Tomas
>>
>>
>>
>>> James
>>>
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>
>>>
>>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>>
> NïïïïïrïïyïïïbïXïïÇvï^ï)Þ{.nï+ïïïï{ïïï"ï{ayïÊÚï,j
ïïfïïïhïïïzïïwïïï
ïïïj:+vïïïwïjïmïïïï
ïïïïzZ+ïïÝj"ïï!tml=

èº{.nÇ+‰·Ÿ®‰­†+%ŠËlzwm…ébëæìr¸›zX§»®w¥Š{ayºÊÚë,j­¢f£¢·hš‹àz¹®w¥¢¸ ¢·¦j:+v‰¨ŠwèjØm¶Ÿÿ¾«‘êçzZ+ƒùšŽŠÝj"ú!¶iO•æ¬z·švØ^¶m§ÿðà nÆàþY&—