Re: [PATCH v2] ata: libata-scsi: Fix get identity data failed

From: Damien Le Moal
Date: Mon May 22 2023 - 03:05:52 EST


On 5/22/23 10:35, Damien Le Moal wrote:
> On 5/8/23 10:11, yangxingui wrote:
>>
>>
>> On 2023/5/7 22:51, Damien Le Moal wrote:
>>> On 2023/05/05 18:06, yangxingui wrote:
>>>>
>>>>
>>>> On 2023/5/5 16:17, Damien Le Moal wrote:
>>>>> On 2023/05/05 11:57, Xingui Yang wrote:
>>>>>> The function ata_get_identity() uses the helper ata_scsi_find_dev() to get
>>>>>> the ata_device structure of a scsi device. However, when the ata device is
>>>>>> managed by libsas, ata_scsi_find_dev() returns NULL, turning
>>>>>> ata_get_identity() into a nop and always returns -ENOMSG.
>>>>>
>>>>> What do you do to hit the issue ? A while back for me it was the queue depth
>>>>> setting causing problems. As Garry mentioned, this led to patch 141f3d6256e5
>>>>> ("ata: libata-sata: Fix device queue depth control").
>>>> Attempt to return the correct value at ata_scsi_find_dev() instead of
>>>> NULL, when the ata device is managed by libsas?
>>>
>>> That I understand. My question is *what* user operation/command triggers this ?
>>> Because on my test setup, under normal use, I do not see this issue (beside what
>>> was already corrected with the queue depth control). Is the issue showing up
>>> when using passthrough commands only ?
>> Yeah, we found that command "hdparm -i /dev/sdc" always return faild for
>> SATA HDD disk. as follows:
>> [root@localhost ~]# hdparm -i /dev/sdc
>>
>> /dev/sdc:
>> HDIO_GET_IDENTITY failed: Invalid argument
>
> I cannot recreate this issue exactly like this. Here is my setup with a pm80xx
> driver (Adaptec HBA):
>
> [7:0:0:0] disk ATA WDC WUH721818AL W232 /dev/sdd /dev/sg5
> [7:0:1:0] disk ATA WDC WUH721818AL WTW2 /dev/sdi /dev/sg6
> [7:0:2:0] disk ATA WDC WUH722222AL Wf86 /dev/sdf /dev/sg7
> [7:0:3:0] zbc ATA WDC WSH722020AL W803 /dev/sdg /dev/sg8
>
> Using the first drive, I get:
>
> sudo hdparm -i /dev/sdd
>
> /dev/sdd:
>
> Model=WDC WUH721818ALN604, FwRev=PCGNW232, SerialNo=3KG10LBK
> Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
> RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=56
> BuffType=DualPortCache, BuffSize=unknown, MaxMultSect=2, MultSect=off
> CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=4394582016
> IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
> PIO modes: pio0 pio1 pio2 pio3 pio4
> DMA modes: mdma0 mdma1 mdma2
> UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6
> AdvancedPM=yes: disabled (255) WriteCache=enabled
> Drive conforms to: unknown: ATA/ATAPI-2,3,4,5,6,7
>
> * signifies the current active mode
>
> So all good. However, for the following drives, I get:
>
> sudo hdparm -i /dev/sdi
>
> /dev/sdi:
> HDIO_GET_IDENTITY failed: No message of desired type
>
> (same for sdf and sdg).
>
> Will dig into this.

OK, so the issue is that __ata_scsi_find_dev() calls ata_find_dev() with devno
== scsidev->id. This leads to devno being 0, 1, 2 and 3 for connected drives
sdd, sd1, sdf and sdg, as shown by lsscsi. However, each drive has its own
port+link, with the link for each one having ata_link_max_devices() == 1, so
ata_find_dev() works only for the first drive with scsidev->id == 0 and fails
for the others. A naive fix would be this:

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 7bb12deab70c..e4d6f17d7ccc 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -2718,7 +2718,7 @@ static struct ata_device *__ata_scsi_find_dev(struct
ata_port *ap,
if (!sata_pmp_attached(ap)) {
if (unlikely(scsidev->channel || scsidev->lun))
return NULL;
- devno = scsidev->id;
+ devno = 0;
} else {
if (unlikely(scsidev->id || scsidev->lun))
return NULL;

And running this on my setup, it works. This makes libsas added ports/devices
look like AHCI ones, where all devices have ID 0 for the !pmp case.

However, I am not sure this would be OK for all setups...

John,

Any idea if there is any cases where libsas managed drives would endup not being
correctly identified by this change ? As long as a device always has its own
port, I do not see any issue. But is there a case where we could have multiple
devices on the same port ? Per libata, max is 2, and that is only for the IDE
master/slave case. Otherwise, it is always 1.

Not that looking at the pmp case, I am not confident at all that the
identification is correct for libsas. But I do not think that anyone would ever
connect a pmp box to a libsas HBA...

--
Damien Le Moal
Western Digital Research