Re: [PATCH] SCSI disk won't spinup, so boot hangs

From: Paul Slootman
Date: Tue Jul 18 2006 - 08:45:22 EST


Erik Mouw <erik@xxxxxxxxxxxxxxxxxxxxx> wrote:
>On Wed, Jul 12, 2006 at 04:14:41PM +0200, Paul Slootman wrote:
>> I was recently confronted by a SCSI disk that had crashed somehow
>> ("mechanical positioning error" during the first signs of trouble). The
>> kernel hung after that, despite the fact that everything was on RAID-1
>> and the first disk still was working...
>
>Sounds like a worn bearing or a head crash.

Probably a head crash, this was a 5 month old 15krpm 73GB scsi disk.

>> I reasoned that if the spinup fails, it doesn't make much sense to try
>> and read the capacity, the partition tables, etc.. Hence I came up with
>> the patch below that sets media_present to 0 when the spinup doesn't
>> respond. It works for me (TM); there may be a better way, however the
>> current behaviour sucks big time.
>
>I've seen disks where the 100s timeout is not enough and which only
>came into a useful state after 10 minutes or even longer. Your patch

I'd consider disks that take 10 minutes or more to become ready
extremely obsolete :-)

>would make those disks useless, even though they can be used. Also I'm

Not really, wait your 10 minutes, echo "scsi remove-single-device a b c
d" > /proc/scsi/scsi; echo "scsi add-single-device a b c d" >
/proc/scsi/scsi; and there it is.

>not sure your patch is the right approach: a disk doesn't have
>removable media and yet it reports that there's no media present.

For all intents and purposes, the media isn't there...
Do you really think that the current behaviour (wait until eternity for
the drive to become ready on read capacity) to be the correct behaviour?
If so, why bother with the delay (of max. 100s) until spinup? Simply
issue the spinup command, and start trying to read the capacity would be
enough then.


Paul Slootman

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/