Re: [PATCH] ata: libata-sata: retry hardreset when device detected but PHY not established

From: yangxingui

Date: Sun Apr 26 2026 - 21:51:28 EST




On 2026/4/26 6:53, Damien Le Moal wrote:
On 4/25/26 15:04, Xingui Yang wrote:
When sata_link_hardreset() detects that the link is offline, it currently
returns immediately without distinguishing the reason. According to the SATA
specification, the SStatus register's DET field (bits 0-3) indicates:
- 0x0: No device detected, PHY not communicating
- 0x1: Device detected but PHY communication not established
- 0x3: Device detected and PHY communication established
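
For reference, the decoding described above can be sketched in plain userspace C. This is only an illustration and not code from libata: the names sstatus_det(), det_needs_retry() and the DET_* constants are hypothetical, and only the three DET values listed above are named.

```c
#include <stdint.h>

/* DET occupies bits 0-3 of SStatus, as listed above. */
#define SSTATUS_DET_MASK   0xfu

/* Hypothetical names for the three DET values listed above. */
#define DET_NO_DEVICE      0x0u  /* no device detected, PHY not communicating */
#define DET_DEV_NO_PHY     0x1u  /* device detected, PHY communication not established */
#define DET_DEV_PHY_UP     0x3u  /* device detected, PHY communication established */

/* Extract the DET field from a raw SStatus value. */
static inline uint32_t sstatus_det(uint32_t sstatus)
{
	return sstatus & SSTATUS_DET_MASK;
}

/*
 * Mirror of the condition the patch tests: a device is present but the
 * PHY has not come up, so a retry might help. Returns 1 or 0.
 */
static inline int det_needs_retry(uint32_t sstatus)
{
	return sstatus_det(sstatus) == DET_DEV_NO_PHY;
}
```

With these helpers, an SStatus of 0x1 would ask for a retry, while 0x0 (nothing attached) and a value such as 0x133 (DET = 0x3) would not.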

To improve device detection reliability, add a check for this case: when
the link is offline but the DET field shows 0x1, return -EAGAIN to trigger
a retry rather than giving up immediately.

Signed-off-by: Xingui Yang <yangxingui@xxxxxxxxxx>

This is a pure ATA patch so please CC the linux-ide list, not the linux-scsi list.

Ok.

Also, please check your mail setup: your email was in my Junk folder.

Well, the patch was sent using git send-email.


---
drivers/ata/libata-sata.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/ata/libata-sata.c b/drivers/ata/libata-sata.c
index b9d635088f5f..e5bb92c38e38 100644
--- a/drivers/ata/libata-sata.c
+++ b/drivers/ata/libata-sata.c
@@ -667,8 +667,18 @@ int sata_link_hardreset(struct ata_link *link, const unsigned int *timing,
 	if (rc)
 		goto out;
 	/* if link is offline nothing more to do */
-	if (ata_phys_link_offline(link))
+	if (ata_phys_link_offline(link)) {

This is preceded by a call to sata_link_resume(), which calls
sata_link_debounce() and that function makes sure that DET is stable. So if
after that DET still shows that there is no PHY, there is likely a big problem
with it and it is super slow to be established.

In this case, I do not think that doing another hardreset is the right thing to
do. Have you tried increasing the deadline for hardreset? That deadline is used
as the limit for the link debounce too.
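
To make the role of the deadline concrete, here is a simplified userspace model of a debounce loop bounded by a deadline. It is a sketch under assumptions, not the code of sata_link_debounce(): the function debounce_det(), its parameters, and the idea of feeding it a pre-recorded array of SStatus samples are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: samples[i] is the SStatus value read at time
 * i * interval_ms. Returns 0 and stores the DET value in *det_out once
 * DET has held the same value for at least duration_ms. Returns -1 if
 * the deadline passes, or the samples run out, before that happens.
 */
static int debounce_det(const uint32_t *samples, size_t n,
			unsigned int interval_ms, unsigned int duration_ms,
			unsigned int deadline_ms, uint32_t *det_out)
{
	uint32_t last;
	unsigned int stable_since = 0;
	size_t i;

	if (n == 0)
		return -1;
	last = samples[0] & 0xf;

	for (i = 1; i < n; i++) {
		unsigned int now = (unsigned int)i * interval_ms;
		uint32_t cur = samples[i] & 0xf;

		if (now > deadline_ms)
			return -1;	/* ran out of time */

		if (cur != last) {
			/* DET changed: restart the stability window. */
			last = cur;
			stable_since = now;
			continue;
		}

		if (now - stable_since >= duration_ms) {
			*det_out = cur;	/* stable long enough */
			return 0;
		}
	}
	return -1;
}
```

In this model a longer deadline simply gives a slow PHY more samples in which to move from DET 0x1 to 0x3 and settle there, whereas a DET that sits at 0x1 for the whole stability window is reported as stable at 0x1.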

Do you have a specific controller/device where you see this issue? What exactly
is the hardware setup where you see this issue?

Our customer is qualifying a new disk and occasionally sees the hard reset of the disk fail. No error is logged for the resume or debounce steps.

[ 22.864418][ T1285] ahci 0000:76:03.0: Adding to iommu group 23
[ 22.870403][ T1285] ahci 0000:76:03.0: controller does not support SXS, disabling CAP_SXS
[ 22.878655][ T1285] ahci 0000:76:03.0: SSS flag set, parallel bus scan disabled
[ 22.885966][ T1285] ahci 0000:76:03.0: AHCI 0001.0300 32 slots 2 ports 6 Gbps 0x3 impl SATA mode
[ 22.894743][ T1285] ahci 0000:76:03.0: flags: 64bit ncq sntf stag pm led clo only pmp fbs slum part ccc ems boh
[ 22.905277][ T1285] scsi host0: ahci
[ 22.909061][ T1285] scsi host1: ahci
[ 22.966463][ T1285] ata1: SATA max UDMA/133 abar m4096@0xa3010000 port 0xa3010100 irq 108
[ 22.974629][ T1285] ata2: SATA max UDMA/133 abar m4096@0xa3010000 port 0xa3010180 irq 109
[ 25.242373][ T1286] ata1: SATA link down (SStatus 1 SControl 300) <==============
[ 25.659901][ T1288] ata2: SATA link down (SStatus 0 SControl 300)



+		u32 sstatus;
+
+		if (sata_scr_read(link, SCR_STATUS, &sstatus) == 0 &&
+		    (sstatus & 0xf) == 0x1) {
+			ata_link_warn(link, "device detected but PHY not ready (SStatus %X), retrying\n",
+				      sstatus);
+			rc = -EAGAIN;
+		}
+
 		goto out;
+	}
 	/* Link is online. From this point, -ENODEV too is an error. */
 	if (online)



Thanks,
Xingui