Re: [PATCH v4] ata: libata-eh: Honor all EH scheduling requests

From: Li Nan
Date: Mon Sep 11 2023 - 18:24:30 EST




在 2023/9/11 15:16, Damien Le Moal 写道:
On 9/8/23 06:02, Niklas Cassel wrote:
On Thu, Sep 07, 2023 at 03:43:19PM +0800, kernel test robot wrote:

I am not confident that playing with host_eh_schedule count is the right

Yeah, If there is some way can wake up EH without inc host_eh_schedule,
dec count becomes risky.

approach. A better solution may be to change the timing of clearing
ATA_PFLAG_EH_PENDING. Right now, this is done on entry to
ata_scsi_port_error_handler(), unconditionally. So ata_eh_reset() should not
need to clear the flag again. If we remove that, then a new interrupt received
after ata_eh_thaw() and setting EH_PENDING would be cought by the retry loop in
ata_scsi_port_error_handler(), which would run again ap->ops->error_handler(ap).

So let's try this fix instead:

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 159ba6ba19eb..d1d081aa0c95 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -2807,7 +2807,6 @@ int ata_eh_reset(struct ata_link *link, int classify,
memset(&link->eh_info, 0, sizeof(link->eh_info));
if (slave)
memset(&slave->eh_info, 0, sizeof(link->eh_info));
- ap->pflags &= ~ATA_PFLAG_EH_PENDING;
spin_unlock_irqrestore(link->ap->lock, flags);

if (ata_port_is_frozen(ap))

Li,

Can you please test this ?


OK, Let us test this patch.



It might be worth mentioning that the race window for the bug that the patch
in $subject is fixing, should be much smaller after this patch is in:
https://lore.kernel.org/linux-ide/20230907081710.4946-1-Chloe_Chen@xxxxxxxxxxxxxx/

Li Nan, perhaps you could see if you can still reproduce your original
problem with the patch from the ASMedia guys?



However, even with the ASMedia patch, it should still be theoretically
possible to get an error irq after ata_eh_reset() has called ahci_thaw(),
so I suppose that this patch still makes some sense...


Kind regards,
Niklas


--
Thanks,
Nan