Re: Is BIO_RW_FAILFAST really usable?

From: Jens Axboe
Date: Tue Dec 04 2007 - 04:15:53 EST


On Tue, Dec 04 2007, Neil Brown wrote:
>
> I've been looking at using BIO_RW_FAILFAST in md/raid to improve
> handling of some error cases.
>
> This is particularly significant for the DASD driver (s390 specific).
> I believe it uses optic fibre to connect to the drives. When one of
> these paths is unplugged, IO requests will block until an operator
> runs a command to reset the card (or until it is plugged back in).
> The only way to avoid this blockage is to use BIO_RW_FAILFAST. So
> we really need BIO_RW_FAILFAST for a reliable RAID1 configuration on
> DASD drives.
>
> However, I just tested BIO_RW_FAILFAST on my SATA drives: controller
>
> 02:06.0 RAID bus controller: Silicon Image, Inc. SiI 3114 [SATALink/SATARaid] Serial ATA Controller (rev 02)
>
> (not using the card's minimal RAID functionality) and requests fail
> immediately and always with e.g.
>
> sd 2:0:0:0: [sdc] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
> end_request: I/O error, dev sdc, sector 2048
>
> So fail fast obviously isn't generally usable.
>
> What is the answer here? Is the Silicon Image driver doing the wrong
> thing, or is DASD doing the wrong thing, or is BIO_RW_FAILFAST
> under-specified and we really need multiple flags or what?

Hrmpf. It looks like the SCSI layer is a little too trigger happy. Any
chance you could try and trace where this happens?

--
Jens Axboe
