Re: CD writing in future Linux (stirring up a hornets' nest)

From: John Stoffel
Date: Wed Feb 15 2006 - 22:07:54 EST


>>>>> "Rob" == Rob Landley <rob@xxxxxxxxxxx> writes:

Rob> Yup. Apparently with SAS, the controllers are far more likely to
Rob> fail than the drives.

While a single drive is more likely to fail when compared to a single
controller, for a truly redundant system you want no single point of
failure, which makes redundant controllers a requirement.

>> Makes redundant systems much simpler to build if you can connect
>> each physical drive to two places at once.

Rob> Or you could use raid and get complete redundancy rather than two
Rob> paths to the same single point of failure. Your choice.

Excuse me? Think about what you just wrote here and what you're
implying.

Of course you would use RAID here, along with two controllers and two
paths to each disk. But you'd have multiple disks as well; nobody
would put a single disk behind two controllers and call that
reliable.

>> They support port expanders (which SATA seems to be starting to
>> support although more limited).

Rob> I still don't see why drives are expected to be more reliable
Rob> than controllers.

He never said they were.

Rob> I think the most paranoid setup I've seen was six disks holding
Rob> two disks worth of information. A three way raid-5, mirrored.
Rob> It could lose any three disks out of the group, and several 4
Rob> disk combinations. If six SATA drives are cheaper than two SAS
Rob> drives. (Yeah, the CRC calculation eats CPU and flushes your
Rob> cache. So what?)

And how many controllers could that setup lose? You need to think of
the whole path, not just the disks at the ends, when you are planning
for reliability (and performance as well).
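
As for the parity calculation Rob mentions eating CPU: RAID-5 parity
is just a byte-wise XOR across the stripe, so a rebuild has to touch
every byte of every surviving block. A toy sketch (nothing like the
actual md driver code) to show the shape of it:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

/*
 * Illustrative sketch only -- not the kernel md driver's code.
 * RAID-5 parity is the byte-wise XOR of every data block in a stripe,
 * so any one missing block can be rebuilt by XORing the survivors.
 */
static void xor_parity(uint8_t *out, uint8_t *const blocks[],
                       int nblocks, int blocksize)
{
        memset(out, 0, blocksize);
        for (int d = 0; d < nblocks; d++)
                for (int i = 0; i < blocksize; i++)
                        out[i] ^= blocks[d][i];
}

int main(void)
{
        uint8_t d0[4] = { 1, 2, 3, 4 }, d1[4] = { 5, 6, 7, 8 },
                d2[4] = { 9, 10, 11, 12 }, parity[4], rebuilt[4];
        uint8_t *stripe[] = { d0, d1, d2 };

        xor_parity(parity, stripe, 3, 4);

        /* "Lose" d1: rebuild it from the other data blocks plus parity. */
        uint8_t *survivors[] = { d0, d2, parity };
        xor_parity(rebuilt, survivors, 3, 4);

        printf("rebuilt d1: %d %d %d %d\n",
               rebuilt[0], rebuilt[1], rebuilt[2], rebuilt[3]);
        return 0;
}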

Also, with dual ports on a drive, it becomes much easier to build
two-machine clusters where both nodes can see all the drives shared
between them. Just like SCSI (old, original 5 MB/s SCSI) where you
changed the ID of one of the initiators. Not done frequently, but
certainly done a lot with VMS/VAX clusters.

Rob> I keep thinking there should be something more useful you could
Rob> do than "hot spare" with extra disks in simple RAID 5, some way
Rob> of dynamically scaling more parity info. But it's not an area I
Rob> play in much...

RAID6, or Dual Parity as NetApp calls it. You can lose any TWO disks
in a raid group and still be working. It covers the more common
single-disk failure, and you still have full parity coverage if
another disk fails during the rebuild of the parity info onto the
spare drive.
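
To make "dual parity" concrete, here's a toy sketch (not the kernel's
lib/raid6 code): P is the same XOR as RAID-5, and Q is a Reed-Solomon
syndrome over GF(2^8), which is what lets you solve for any two
missing blocks in a stripe.

#include <stddef.h>
#include <stdint.h>

/*
 * Rough sketch of the two RAID-6 syndromes -- not the kernel's
 * lib/raid6 code.  P is plain XOR parity (same as RAID-5); Q is a
 * Reed-Solomon syndrome over GF(2^8) with the 0x11d polynomial that
 * Linux uses.  Two independent equations per byte let you recover
 * from the loss of any two blocks.
 */

/* Multiply by the generator x in GF(2^8) modulo x^8+x^4+x^3+x^2+1. */
static uint8_t gf_mul2(uint8_t v)
{
        return (uint8_t)((v << 1) ^ ((v & 0x80) ? 0x1d : 0));
}

static void raid6_syndromes(uint8_t *p, uint8_t *q, uint8_t *const data[],
                            size_t ndisks, size_t blocksize)
{
        for (size_t i = 0; i < blocksize; i++) {
                uint8_t pv = 0, qv = 0;

                /* Horner's rule: walk from the highest-numbered disk
                 * down, so data[d] ends up weighted by x^d in Q. */
                for (size_t d = ndisks; d-- > 0; ) {
                        pv ^= data[d][i];
                        qv  = (uint8_t)(gf_mul2(qv) ^ data[d][i]);
                }
                p[i] = pv;
                q[i] = qv;
        }
}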

With 250GB disks that run at 50MB/s, it takes a LONG time to actually
sweep through all the data and rebuild the parity: 24 hours or more.
So to cover your butt, you'd like to have even more redundancy.
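
Roughly, the arithmetic goes like this (using the 250GB / 50MB/s
figures above; the 10% bandwidth share below is just an assumption for
a busy, throttled array):

#include <stdio.h>

/*
 * Back-of-envelope rebuild time.  The idle-array sweep is the best
 * case; a real rebuild shares the spindles with production I/O and is
 * throttled, which is how you end up at a day or more.  The 10%
 * bandwidth share is an assumed figure, not a measurement.
 */
int main(void)
{
        double size_mb   = 250.0 * 1000.0;   /* one 250 GB member    */
        double rate_mb_s = 50.0;             /* sustained streaming  */
        double sweep_s   = size_mb / rate_mb_s;

        printf("idle-array sweep: %.0f s (~%.1f hours)\n",
               sweep_s, sweep_s / 3600.0);
        printf("at an assumed 10%% of bandwidth: ~%.1f hours\n",
               sweep_s * 10.0 / 3600.0);
        return 0;
}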

I've run fully mirrored servers, where I had redundant paths to each
disk from each controller. When I lost a controller, which certainly
happened, I didn't lose any data, nor did I lose mirroring either.
Very nice.

In the situations where I only had one controller per set of disks,
and mirrored between controllers, losing a controller meant I had to
re-mirror things once it was running again, but I didn't lose data.
Very nice.

Building reliable disk storage is not cheap. Fast, reliable, cheap.
Pick any two. :]

John