Re: two physical drives, but two device files point to same drive

From: Matt Garman
Date: Wed Sep 01 2010 - 12:46:40 EST


Sorry to post this before I did more investigating. Turns out one of
the new Hitachi drives was bad.

Still, I think it's a curious failure mode. If I pull the drive that
was "doubly" recognized, the bad drive isn't recognized by Linux at
all. If I pull the bad drive, the good drive is no longer doubly
recognized (i.e. it has exactly one device file).

Strange. But replacing the broken drive with a good one makes things
look and act as they should.
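
For anyone chasing the same symptom: the serial-number comparison I did
by hand with smartctl (quoted below) is easy to script. This is only a
minimal sketch in Python, assuming the "smartctl -a" text for each
device has already been captured somehow. The function names and the
SAMPLE text are made up for illustration (SAMPLE just reuses the
excerpt quoted below); it does not run smartctl itself.

```python
import re
from collections import defaultdict

# Matches the "Serial Number:" line as it appears in `smartctl -a` output.
SERIAL_RE = re.compile(r"^Serial Number:\s*(\S+)", re.MULTILINE)


def serial_from_smartctl(output):
    """Return the serial number found in smartctl text, or None."""
    match = SERIAL_RE.search(output)
    return match.group(1) if match else None


def find_duplicate_serials(outputs):
    """Return {serial: [devices]} for serials seen on more than one device.

    `outputs` maps a device path to the captured `smartctl -a` text.
    """
    by_serial = defaultdict(list)
    for device, text in sorted(outputs.items()):
        serial = serial_from_smartctl(text)
        if serial is not None:
            by_serial[serial].append(device)
    return {s: devs for s, devs in by_serial.items() if len(devs) > 1}


# Illustrative input only: the excerpt from the quoted message.
SAMPLE = """Device Model:     Hitachi HDS722020ALA330
Serial Number:    JK11A1YAJHEX7V
Firmware Version: JKAOA3EA
User Capacity:    2,000,398,934,016 bytes
"""

if __name__ == "__main__":
    print(find_duplicate_serials({"/dev/sdx": SAMPLE, "/dev/sdy": SAMPLE}))
```

Two device files reporting the same serial means they point at the same
physical drive, which is exactly what sdx and sdy were doing here.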

Thanks,
Matt

On Wed, Sep 1, 2010 at 10:18 AM, Matt Garman <matthew.garman@xxxxxxxxx> wrote:
> I have a backup server with 25 total SATA drives.  I just swapped two
> drives for two new ones.  When I boot, there are two /dev/sdX device
> files, but both device files actually refer to the same physical
> drive.  The other physical drive does not have a device file and
> cannot be accessed.
>
> Here is a dmesg excerpt where the two drives are identified:
>
> mptsas: ioc0: attaching sata device: fw_channel 0, fw_id 21, phy 26,
> sas_addr 0x50030480006cd85a
>  Vendor: ATA       Model: Hitachi HDS72202  Rev: A3EA
>  Type:   Direct-Access                      ANSI SCSI revision: 05
> SCSI device sdx: 3907029168 512-byte hdwr sectors (2000399 MB)
> sdx: Write Protect is off
> sdx: Mode Sense: 73 00 00 08
> SCSI device sdx: drive cache: write back
> SCSI device sdx: 3907029168 512-byte hdwr sectors (2000399 MB)
> sdx: Write Protect is off
> sdx: Mode Sense: 73 00 00 08
> SCSI device sdx: drive cache: write back
>  sdx: sdx1
> sd 4:0:22:0: Attached scsi disk sdx
> mptsas: ioc0: attaching sata device: fw_channel 0, fw_id 21, phy 27,
> sas_addr 0x50030480006cd85b
>  Vendor: ATA       Model: Hitachi HDS72202  Rev: A3EA
>  Type:   Direct-Access                      ANSI SCSI revision: 05
> SCSI device sdy: 3907029168 512-byte hdwr sectors (2000399 MB)
> sdy: Write Protect is off
> sdy: Mode Sense: 73 00 00 08
> SCSI device sdy: drive cache: write back
> SCSI device sdy: 3907029168 512-byte hdwr sectors (2000399 MB)
> sdy: Write Protect is off
> sdy: Mode Sense: 73 00 00 08
> SCSI device sdy: drive cache: write back
>  sdy: sdy1
> sd 4:0:23:0: Attached scsi disk sdy
>
> So the two drives are supposed to be sdx and sdy.
>
> Here is an excerpt from "smartctl -a /dev/sdx":
>
> Device Model:     Hitachi HDS722020ALA330
> Serial Number:    JK11A1YAJHEX7V
> Firmware Version: JKAOA3EA
> User Capacity:    2,000,398,934,016 bytes
>
>
> Likewise, "smartctl -a /dev/sdy":
>
> Device Model:     Hitachi HDS722020ALA330
> Serial Number:    JK11A1YAJHEX7V
> Firmware Version: JKAOA3EA
> User Capacity:    2,000,398,934,016 bytes
>
> As you can see, the serial numbers are the same.
>
> Furthermore, the drives are both brand new, and come from the factory
> without any partition table.  I created a partition table and ext3
> filesystem on sdx.  When I started fdisk for sdy, a partition already
> existed.  I went ahead and tried to create an ext3 filesystem on sdy
> as well, but after creating the inode blocks, it failed, saying, "file
> already exists".
>
> The drives are attached to an LSI Logic SASX36 expander/backplane.
> The SAS controller is: LSI Logic / Symbios Logic SAS1068E PCI-Express
> Fusion-MPT SAS.
>
> The kernel is from RHEL/CentOS: 2.6.18-194.3.1.el5
>
> With the exception of these two new Hitachi drives, I have been using
> this hardware for several months, with other drives connected to the
> exact same expander ports.  The only difference is that this is the
> first time I've used Hitachi drives.  Is it possible they are not
> compatible with the expander?  Or is it a kernel/driver bug (for
> either the SAS controller or expander)?
>
> Thanks,
> Matt
>