Re: [PATCH] ata: libata-sata: retry hardreset when device detected but PHY not established

From: yangxingui

Date: Thu May 28 2026 - 03:02:00 EST




On 2026/4/30 16:46, Niklas Cassel wrote:
On Wed, Apr 29, 2026 at 03:01:48PM +0800, yangxingui wrote:
This is preceded by a call to sata_link_resume(), which calls
sata_link_debounce(), and that function makes sure that DET is stable. So if
after that DET still shows that there is no PHY, there is likely a big problem
with it and it is super slow to be established.

I agree with Damien, sata_link_debounce() is supposed to make sure that
DET is stable.


sata_link_debounce() will not explicitly wait for SStatus.DET to turn 0x3.

If value is stable, and SStatus.DET == 1, and time is before "deadline",
sata_link_debounce() will continue looping.

Else, if value is stable, and has been stable for "duration" amount of time,
it will return.
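
To make the two rules above concrete, here is a minimal user-space sketch of
the loop as described in this mail. It is not the libata code: the simulated
clock, the read_sstatus_fn callback, the slow_phy() device and the -1 return
for a value that never settles are all assumptions made for illustration.

```c
/* Hypothetical device model: returns SStatus at simulated time t_ms. */
typedef unsigned int (*read_sstatus_fn)(unsigned long t_ms);

/* Example device whose PHY needs 2000 ms: DET reads 1 until then, 3 after. */
static unsigned int slow_phy(unsigned long t_ms)
{
	return t_ms < 2000 ? 0x1 : 0x3;
}

/*
 * Returns 0 once DET has been stable for duration_ms, storing the last DET
 * value in *final_det. Returns -1 (sketch-only convention) if DET is still
 * changing when the deadline has passed.
 */
static int debounce_sketch(read_sstatus_fn read_sstatus,
			   unsigned long interval_ms,
			   unsigned long duration_ms,
			   unsigned long deadline_ms,
			   unsigned int *final_det)
{
	unsigned long now = 0, last_change = 0;
	unsigned int last = read_sstatus(now) & 0xf;
	unsigned int cur;

	for (;;) {
		now += interval_ms;	/* stands in for sleeping one interval */
		cur = read_sstatus(now) & 0xf;

		if (cur == last) {
			/* Device detected, no PHY yet: keep polling until the deadline. */
			if (cur == 1 && now < deadline_ms)
				continue;
			/* Stable for long enough (or deadline expired with DET == 1). */
			if (now - last_change >= duration_ms) {
				*final_det = cur;
				return 0;
			}
			continue;
		}

		/* Value changed: restart the stability window. */
		last = cur;
		last_change = now;
		if (now >= deadline_ms) {
			*final_det = cur;
			return -1;
		}
	}
}
```

With slow_phy(), a 10 ms interval and a 100 ms duration, a 1000 ms deadline
makes the sketch return 0 with DET still at 1, while a 5000 ms deadline makes
it return 0 with DET at 3. That is the effect of the deadline that is being
discussed here.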

Since your print shows that SStatus == 1, that most likely means that the
deadline expired in sata_link_debounce().

I suggest that you try to increase the deadline, perhaps start off by simply
multiplying it by some factor in sata_link_debounce().


It would also be helpful if your commit message explained why returning
-EAGAIN makes a difference, because from what I can see, if the deadline
expires, sata_link_debounce() returns 0, which should cause sata_link_resume()
to return 0, which should cause sata_link_hardreset() to
return 0, with online == false.

If that is the case ata_do_reset() would return 0, and
ata_eh_followup_srst_needed() (returns true only if -EAGAIN) would return false.

Which should eventually cause us to retry another hard reset, as long as
tries <= max_tries.


By making sata_link_hardreset() return -EAGAIN, the difference I see is that
we will perform a follow-up software reset after the hardreset, but your commit
message did not mention that.
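
The two outcomes discussed above can be summarised as a small decision
function. This is only a toy model of the flow as described in this mail, not
of the actual libata-eh code; the enum, the function name and its parameters
are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical labels for what happens after one hardreset attempt. */
enum next_step {
	STEP_DONE,			/* link is online, nothing more to do */
	STEP_FOLLOWUP_SOFTRESET,	/* hardreset returned -EAGAIN */
	STEP_RETRY_HARDRESET,		/* rc == 0 but link offline, tries left */
	STEP_GIVE_UP			/* no tries left */
};

static enum next_step after_hardreset(int rc, bool online,
				      int tries, int max_tries)
{
	/* Per the mail, -EAGAIN is what requests a follow-up softreset. */
	if (rc == -EAGAIN)
		return STEP_FOLLOWUP_SOFTRESET;

	if (rc == 0 && online)
		return STEP_DONE;

	/* Otherwise retry another hardreset as long as tries <= max_tries. */
	return tries <= max_tries ? STEP_RETRY_HARDRESET : STEP_GIVE_UP;
}
```

In this model, after_hardreset(0, false, 1, 3) selects another hardreset,
whereas after_hardreset(-EAGAIN, false, 1, 3) selects the follow-up softreset.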

So, my question is, why is it not sufficient to retry another
hardreset/COMRESET?

Does it work to only do a hardreset (without a follow-up softreset) if you
increase the deadline?

Latest update: the issue we are currently facing is that the SATA host keeps waiting for the disk to return the COMWAKE primitive, although the SATA analyzer shows that the disk has already returned it. This may be caused by poor COMWAKE signal quality, which prevents the host from recognizing the primitive. Performing a hard reset might fix the issue.

Thanks
Xingui