Re: Journaling pointless with today's hard disks?

From: Stephen Satchell (satch@concentric.net)
Date: Sat Nov 24 2001 - 18:23:30 EST


At 02:28 PM 11/24/01 -0800, H. Peter Anvin wrote:
> > > However, if it's really true that DTLA drives and their successor
> > > corrupt blocks (generate bad blocks) on power loss during block writes,
> > > these drives are crap.
> >
> > They do, even IBM admits that (on
> >
> > http://www.cooling-solutions.de/dtla-faq
> >
> > you find a quote from IBM confirming this). IBM says it's okay, you
> > have to expect this to happen. So much for their expertise in making
> > hard disks. This makes me feel rather dizzy (lots of IBM drives in
> > use).
> >
>
>No sh*t. I have always been favouring IBM drives, and I had a RAID
>system with these drives bought. It will be a LONG time before I buy
>another IBM drive, that's for sure. I can't believe they don't even
>have the decency of saying "we fucked".

It is the responsibility of the power monitor to detect a power-fail event
and tell the drive(s) that it is occurring. If power goes out of
specification before the drive completes a commanded write, what do you
expect the poor drive to do? ANY glitch in the write current will corrupt
the block being written no matter what -- the final CRC never gets
recorded. Most drives do have a panic-stop mode that kicks in when they
detect voltage going out of range. It minimizes the damage caused by an
out-of-specification power-down event and, more importantly, uses the
energy in the spinning platter to get the heads moved to a safe place
before the drive completely spins down. The panic-stop mode is EXACTLY like
a Linux OOPS -- it's a catastrophic event that SHOULD NOT OCCUR.
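
To make the CRC point concrete, here is a minimal Python sketch of a write
that gets cut off before the trailer reaches the media. Everything in it is
an assumption for illustration: the 512-byte payload, the helper names, and
zlib.crc32 standing in for whatever check code a real drive records (real
drives use their own ECC, not this).

```python
import zlib

SECTOR_DATA = 512  # payload bytes per block (assumed for illustration)

def encode_block(payload: bytes) -> bytes:
    """Payload followed by a 4-byte CRC32 trailer, in the order it is written."""
    assert len(payload) == SECTOR_DATA
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def block_is_valid(raw: bytes) -> bool:
    """True if the stored trailer matches the CRC of the stored payload."""
    payload, trailer = raw[:SECTOR_DATA], raw[SECTOR_DATA:]
    return len(trailer) == 4 and zlib.crc32(payload).to_bytes(4, "big") == trailer

def interrupted_write(old_raw: bytes, new_raw: bytes, bytes_written: int) -> bytes:
    """What is on the media if the write stops after bytes_written bytes:
    new data up to that point, stale old data after it."""
    return new_raw[:bytes_written] + old_raw[bytes_written:]

old = encode_block(b"\x00" * SECTOR_DATA)
new = encode_block(b"\xff" * SECTOR_DATA)

# Power goes away after the payload but before the final CRC is recorded.
torn = interrupted_write(old, new, SECTOR_DATA)
print(block_is_valid(old), block_is_valid(new), block_is_valid(torn))
```

The torn block carries new data with the old trailer, so the check fails on
the next read and the drive can only report the block as bad.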

Most power supplies are not designed to hold up for more than 30-60 ms at
full load once mains power is removed. Power-fail detection typically
averages over three-quarters of a cycle before deciding that mains power
has failed, which is about 12.5 ms at 60 Hz or 15 ms at 50 Hz. That leaves
your system a very short time to abort that long queue of disk write
commands. It's very possible that by the time the system wakes up to the
fact that its electron feeding tube is empty, it has already started a
write operation that cannot be completed before power goes out of
specification. It's a race condition.
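
The arithmetic behind that race, as a rough Python sketch. The hold-up
range and the three-quarters-cycle detection window are the figures from
the paragraph above; the function names are made up for illustration, and
the sketch ignores any latency between detection and the drives actually
being told.

```python
def detect_time_ms(mains_hz: float, cycles: float = 0.75) -> float:
    """Time to average over `cycles` mains cycles, in milliseconds."""
    return cycles / mains_hz * 1000.0

def margin_ms(holdup_ms: float, mains_hz: float) -> float:
    """Time left between power-fail detection and the supply dropping out
    of specification. Zero or negative means the race is already lost."""
    return holdup_ms - detect_time_ms(mains_hz)

for hz in (60, 50):
    for holdup in (30, 60):
        print(f"{hz} Hz, {holdup} ms hold-up: "
              f"detect {detect_time_ms(hz):.1f} ms, "
              f"margin {margin_ms(holdup, hz):.1f} ms")
```

With a 30 ms hold-up at 50 Hz, half of the budget is gone before anything
knows there is a problem. Whatever is left has to cover stopping the queue
AND finishing the write already in progress.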

Fix your system.

If you don't have a UPS on that RAID, and some means of shutting down the
RAID gracefully when mains power goes, you are sealing your own doom,
regardless of the maker of the hard drive you use in that RAID. Even the
original CDC disk drives, some of the best damn drives ever manufactured in
the world, would corrupt data when power failed during a write.
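
As a sketch of what "some means of shutting down gracefully" has to decide,
here is a hypothetical policy in Python. None of these names come from a
real UPS package; how you actually read the UPS state depends on the UPS
and its monitoring software, and the safety factor is an arbitrary
assumption.

```python
from dataclasses import dataclass

@dataclass
class UpsStatus:
    """Hypothetical snapshot of what a UPS monitor reports."""
    on_battery: bool        # True once mains power has failed
    runtime_left_s: float   # estimated battery runtime remaining, seconds

def must_shut_down(status: UpsStatus,
                   shutdown_needs_s: float,
                   safety_factor: float = 2.0) -> bool:
    """Start the graceful shutdown while there is still comfortably more
    battery left than the shutdown itself takes."""
    if not status.on_battery:
        return False
    return status.runtime_left_s <= shutdown_needs_s * safety_factor

# Mains is fine: nothing to do.
print(must_shut_down(UpsStatus(False, 600.0), shutdown_needs_s=90.0))
# On battery with plenty of runtime: keep running for now.
print(must_shut_down(UpsStatus(True, 600.0), shutdown_needs_s=90.0))
# On battery and close to the limit: flush and stop the RAID now.
print(must_shut_down(UpsStatus(True, 150.0), shutdown_needs_s=90.0))
```

The point is that the decision gets made in seconds or minutes on battery,
not in the few milliseconds the power supply gives you on its own.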

Satch
