Re: [PATCH 009 of 11] md: Support stripe/offset mode in raid10

From: Al Boldi
Date: Mon May 08 2006 - 13:01:16 EST


Neil Brown wrote:
> On Wednesday May 3, a1426z@xxxxxxxxx wrote:
> > Neil Brown wrote:
> > > On Tuesday May 2, a1426z@xxxxxxxxx wrote:
> > > > NeilBrown wrote:
> > > > > The "industry standard" DDF format allows for a stripe/offset
> > > > > layout where data is duplicated on different stripes. e.g.
> > > > >
> > > > > A B C D
> > > > > D A B C
> > > > > E F G H
> > > > > H E F G
> > > > >
> > > > > (columns are drives, rows are stripes, LETTERS are chunks of
> > > > > data).
> > > >
> > > > Presumably, this is the case for --layout=f2 ?
> > >
> > > Almost. mdadm doesn't support this layout yet.
> > > 'f2' is a similar layout, but the offset stripes are a lot further
> > > down the drives.
> > > It will possibly be called 'o2' or 'offset2'.
> > >
> > > > If so, would --layout=f4 result in a 4-mirror/striped array?
> > >
> > > o4 on a 4 drive array would be
> > >
> > > A B C D
> > > D A B C
> > > C D A B
> > > B C D A
> > > E F G H
> > > ....
> >
> > Yes, so would this give us 4 physically duplicate mirrors?
>
> It would give 4 devices each containing the same data, but in a
> different layout - much as the picture shows....
>
> > If not, would it be possible to add a far-offset mode to yield such
> > a layout?
>
> Exactly what sort of layout do you want?

Something like this:

o1 A1 B1 C1 D1

o2 A1 A2 B1 B2
C2 C1 D2 D1

o3 A1 A2 A3 B1
B2 B3 C1 C2
C3 D1 D2 D3

o4 A1 A2 A3 A4
B2 B3 B4 B1
C3 C4 C1 C2
D4 D1 D2 D3

(columns are drives, numbers are stripes, LETTERS are chunks of data).

> > > > Also, would it be possible to have a staged write-back mechanism
> > > > across multiple stripes?
> > >
> > > What exactly would that mean?
> >
> > Write the first stripe, then write subsequent duplicate stripes based on
> > idle with a max delay for each delayed stripe.
> >
> > > And what would be the advantage?
> >
> > Faster burst writes, probably.
>
> I still don't get what you are after.
> You always need to wait for writes of all copies to complete before
> acknowledging the write to the filesystem, otherwise you risk
> corruption if there is a crash and a device failure.

Yes, some people can not stomach running degraded at any time, so they could
set the delay for the first duplicate stripe to 0, and subsequent stripes at
leasure.

Thanks!

--
Al

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/