Re: [REQUEST DISCUSS]: speed up SCSI error handle for host with massive devices
From: Wenchao Hao
Date: Thu Oct 12 2023 - 12:10:16 EST
On Thu, Oct 12, 2023 at 10:51 PM Hannes Reinecke <hare@xxxxxxx> wrote:
>
> On 4/6/22 11:40, Wenchao Hao wrote:
> > On 2022/4/4 13:28, Hannes Reinecke wrote:
> >> On 4/3/22 19:17, Mike Christie wrote:
> >>> On 4/3/22 12:14 PM, Mike Christie wrote:
> >>>> We could share code with scsi_ioctl_reset as well. Drivers that support
> >>>> TMFs via that ioctl already expect queuecommand to be possibly in the
> >>>> middle of a run and IO not yet timed out. For example, the code to
> >>>> block a queue and reset the device could be used for the new EH and
> >>>> SG_SCSI_RESET_DEVICE handling.
> >>>>
> >>>
> >>> Hannes or others,
> >>>
> >>> How do parallel SCSI drivers support scsi_ioctl_reset? Is is not fully
> >>> supported and more only used for controlled testing?
> >>
> >> That's actually a problem in scsi_ioctl_reset(); it really should wait
> >> for all I/O to quiesce. Currently it just sets the 'tmf' flag and calls
> >> into the various reset functions.
> >>
> >> But really, I'd rather get my EH rework in before we're start discussing
> >> modifying EH behaviour.
> >> Let me repost it ...
> >>
> >
> > Would you take fast EH(such as single LUN reset) into consideration, maybe
> > a second but lightweight EH? It means a lot.
> >
> > Or give a way drivers can branch out the general timeout and EH handle logic?
>
> (Re-reading the thread:)
>
> If it's just about device reset I guess we can implement an asynchronous
> version. Based on my EH rework we could / should do:
>
> Have a 'eh_cmd_q' list per 'struct scsi_device' and 'struct
> scsi_target'. So Instead of always moving a failed command to the
> 'eh_cmq_q' list of the host, move it onto the list of the next higher
> level (eg a failed abort would move it to the eh_cmq_q of 'struct
> scsi_device', a failed device reset would move it to the eh_cmq_q of
> 'struct scsi_target' etc).
> That would actually make the code in SCSI EH easier to read as we
> could do away with constantly moving and splitting the per-host
> eh_cmq_q list.
>
> And then, as a second step, implement a new eh callback for
> asynchronous SCSI device aborts. That callback would need to
> stop I/O to the device first, send the TMF, and either
> restart the device upon successful completion or splice
> the list of failed commands onto the target and call
> the normal escalation with skipping eh_device_reset().
>
Yes, the RFC patch I sent before is based on this idea, and the details
of implementation may be different from what you described:
Here are how I did:
Three key operations are abstracted for scsi_device and scsi_target:
- mark device/target recovery: called in command dispatch path to stop I/O
- adding error command: called after abort failed to add error command
to error list
- waking up error handling : called in scsi_device_unbusy() to wake up
error handling work
Add struct scsi_device_eh and scsi_target_eh that encapsulate 3
callbacks for the above three key operations above, and invokes these 3
callbacks in the process mentioned above.
For details, please refer to the patch I posted before:
https://lore.kernel.org/linux-scsi/20230901094127.2010873-2-haowenchao2@xxxxxxxxxx/
The following two patches implement each of the three previously defined
callback functions, using kernel work to implement asynchrony handle.
https://lore.kernel.org/linux-scsi/20230901094127.2010873-9-haowenchao2@xxxxxxxxxx/
https://lore.kernel.org/linux-scsi/20230901094127.2010873-12-haowenchao2@xxxxxxxxxx/
For example, define following struct for error handling of scsi_device:
struct scsi_lun_eh {
spinlock_t eh_lock;
unsigned int eh_num;
struct list_head eh_cmd_q;
struct scsi_device *sdev;
struct work_struct eh_handle_work;
unsigned int fallback:1; /* If fallback to further */
/* recovery on failure */
};
The processing logic after awakening is as follows:
sdev_eh_work()
{
try device reset
if device reset succeed(including TUR)
mark error command of this device handled
else if fallback flag is false
finish commands and mark this device offline
fallback to target/host recovery
}
A flag fallback is defined here to determine whether to continue the advanced
reset after the device reset failed, because some drivers actually only
define the callback of the device reset, it is meaningless to continue
advanced reset for such drivers.
Note: the version I posted has bugs in adding commands which would be fixed
in the next version.
Looking for your response, thanks.
> Hmm?
>
> Cheers,
>
> Hannes
>