Re: reply: reply: [RFC PATCH v3 00/19] scsi: scsi_error: Introduce new error handle mechanism

From: John Garry
Date: Tue Apr 01 2025 - 05:25:42 EST


On 01/04/2025 04:32, Jiangjianjun wrote:

Please use the standard mailing list practice of inlining your responses.

On 31/03/2025 04:10, Jiangjianjun wrote:
Sorry for the late message! I'm working on fixing and testing these issues before re-emailing.

What are you actually working on?

It seems that Hannes' "scsi: EH rework, main part" series, and maybe this one, can help resolve the following issue:

https://lore.kernel.org/linux-block/eef1e927-c9b2-c61d-7f48-92e65d8b0418@xxxxxxxxxx/
with fix attempted in:

https://lore.kernel.org/linux-ide/20241031140731.224589-4-cassel@xxxxxxxxxx/
so that we don't see "fixes" like:
https://lore.kernel.org/linux-scsi/20250329073236.2300582-1-liyihang9@xxxxxxxxxx/T/#m80bcb3f57fd176b7ce41b1f26e8560de6ad52c9d
-----Original Message-----
From: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Sent: March 20, 2025 14:06
To: Hannes Reinecke <hare@xxxxxxx>
Cc: Jiangjianjun <jiangjianjun3@xxxxxxxxxx>; jejb@xxxxxxxxxxxxx;
martin.petersen@xxxxxxxxxx; linux-scsi@xxxxxxxxxxxxxxx;
linux-kernel@xxxxxxxxxxxxxxx; lixiaokeng <lixiaokeng@xxxxxxxxxx>;
hewenliang (C) <hewenliang4@xxxxxxxxxx>; Yangkunlin(Poincare)
<yangkunlin7@xxxxxxxxxx>
Subject: Re: [RFC PATCH v3 00/19] scsi: scsi_error: Introduce new error
handle mechanism

On Fri, Mar 14, 2025 at 10:01:40AM +0100, Hannes Reinecke wrote:
3. The current EH framework is designed around 'struct scsi_cmnd'.
Which means that the command _initiating_ the error handling can only
be returned once the _entire_ error handling (with all
escalations) is finished. And more often than not, the application is
waiting on that command to be completed before the next I/O is sent.
And that really limits the effectiveness of any improved error
handler; the application ultimately has to wait for a host reset
before it can continue.

And someone needs to get your old series to fix that merged before we even start talking about any major EH change.

Sorry, the work assignment of the previous engineer, Wen Chao, has changed, so I will now continue and complete this work.
In the future, I will analyze the details of the solution, improve and refine it based on the above suggestions, and submit emails carefully.

JFYI, IIRC, that "scsi: EH rework, main part" or one of the prep series may require some form of SCSI reserved command support. Niklas raised that point here:
https://lore.kernel.org/linux-scsi/Zyo-E1PCvx_XULvg@ryzen/

I also remember commenting on this, but cannot find a reference.

The SCSI reserved commands series includes the following attempts:
https://lore.kernel.org/linux-scsi/20211125151048.103910-1-hare@xxxxxxx/
https://lore.kernel.org/linux-scsi/1666693096-180008-1-git-send-email-john.garry@xxxxxxxxxx/

Maybe, to move forward, we can implement a basic solution for the concerned drivers, so that progress can be made.