[PATCH] ahci: libahci: clear pending interrupt status

From: Szuying Chen
Date: Thu Aug 10 2023 - 05:32:02 EST


On 8/10/23 14:12, Damien Le Moal wrote:
> On 8/10/23 14:05, Szuying Chen wrote:
> > When ISR handle interface fatal error with error recovery after clear PxIS
> > and PxIE. Another FIS(SDB FIS with err) that set PxIS.IFS to 1 is recevied
> > during error recovery, which causing the HBA not issue any new commands
> > after cmd.ST set 1.
>
> This is not very clear. If there was a fatal error, the drive should be in
> error state and no other SDB FIS can be received as the drive does absolutely
> nothing while in error state (it only waits for a read log 10h command to be>
> issued to get it out of error state). So if you are seeing 2 SDB FIS with
> errors one after the other, you have a buggy device...
>
> However, I may be misunderstanding your issue here. Could you provide more
> details and a dmesg output example of the issue ?

According to AHCI 1.3.1 specification ch6.1.9, when an R_ERR is received
on an H2D Data FIS in normal operation, the HBA sets PxIS.IFS to 1
(fatal error) and halts operation. Referring to SATA 3.0 specification we
know the device will halt queued command processing and transmit SDB FIS to
host with ERR bit in Status field set to one(set PxIS.TFES to 1).

In our case, the ISR handles fatal errors(PxIS.IFS) and enters error
recovery after cleaning up PxIS and PxIE. Then a SDB FIS is received
with interrupt bit(PxIS.TFES) set to 1. According to AHCI 1.3.1
specification ch6.2.2, HBA can't issue(cmd.ST set to 1) any new commands
under PxIS.TFES alive during error recovery.

> >
> > Signed-off-by: Szuying Chen <Chloe_Chen@xxxxxxxxxxxxxx>
> > ---
> > drivers/ata/libahci.c | 12 ++++++++++++
> > 1 file changed, 12 insertions(+)
> >
> > diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
> > index 06aec35f88f2..0ae51fd95d46 100644
> > --- a/drivers/ata/libahci.c
> > +++ b/drivers/ata/libahci.c
> > @@ -679,9 +679,21 @@ static int ahci_scr_write(struct ata_link *link, unsigned int sc_reg, u32 val)
> >
> > void ahci_start_engine(struct ata_port *ap)
> > {
> > + struct ahci_host_priv *hpriv = ap->host->private_data;
> > void __iomem *port_mmio = ahci_port_base(ap);
> > u32 tmp;
> >
> > + /* clear SError */
> > + tmp = readl(port_mmio + PORT_SCR_ERR);
> > + writel(tmp, port_mmio + PORT_SCR_ERR);
> > +
> > + /* clear port IRQ */
> > + tmp = readl(port_mmio + PORT_IRQ_STAT);
> > + if (tmp)
> > + writel(tmp, port_mmio + PORT_IRQ_STAT);
> > +
> > + writel(1 << ap->port_no, hpriv->mmio + PORT_IRQ_STAT);
> > +
> > /* start DMA */
> > tmp = readl(port_mmio + PORT_CMD);
> > tmp |= PORT_CMD_START;
> > --
> > 2.39.2
> >
>
> --
> Damien Le Moal
> Western Digital Research
>
Thanks.