Re: [RFC PATCH v1 0/7] Block/XFS: Support alternative mirror device retry

From: Bob Liu
Date: Sat Dec 08 2018 - 09:50:38 EST

On 11/28/18 3:45 PM, Christoph Hellwig wrote:
> On Wed, Nov 28, 2018 at 04:33:03PM +1100, Dave Chinner wrote:
>> - how does propagation through stacked layers work?
> The only way it works is by each layer driving it. Thus my
> recommendation above, building on your earlier one, to use an index
> that is filled by the driver at I/O completion time.
> E.g.
> bio_init: bi_leg = -1
> raid1: submit bio to lower driver
> raid 1 completion: set bi_leg to 0 or 1
> Now if we want to allow stacking we need to save/restore bi_leg
> before submitting to the underlying device. Which is possible,
> but quite a bit of work in the drivers.
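
For reference, the quoted bi_leg flow can be modelled in userspace roughly as
below. This is only a sketch, not kernel code: struct fake_bio and the fake_*
helpers are hypothetical names made up for illustration, and only the bi_leg
field from the proposal is modelled.

```c
/* Hypothetical userspace stand-in for struct bio; only bi_leg is modelled. */
struct fake_bio {
	int bi_leg;	/* -1 means no leg has been recorded yet */
};

/* Mirrors the "bio_init: bi_leg = -1" step of the quoted proposal. */
static void fake_bio_init(struct fake_bio *bio)
{
	bio->bi_leg = -1;
}

/* Models a raid1 completion: record which mirror leg served the I/O. */
static void fake_raid1_complete(struct fake_bio *bio, int leg)
{
	bio->bi_leg = leg;
}

/*
 * Models one stacked submission: the upper raid1 saves bi_leg before
 * going down, the lower raid1 overwrites it on completion, and the
 * upper raid1 restores the saved value before recording its own leg.
 * Returns the leg the lower layer reported.
 */
static int fake_stacked_submit(struct fake_bio *bio, int upper_leg,
			       int lower_leg)
{
	int saved = bio->bi_leg;		/* save before submitting down */
	int seen_lower;

	fake_raid1_complete(bio, lower_leg);	/* lower layer sets bi_leg */
	seen_lower = bio->bi_leg;

	bio->bi_leg = saved;			/* restore the upper view */
	fake_raid1_complete(bio, upper_leg);	/* upper completion */
	return seen_lower;
}
```

After fake_stacked_submit() returns, the submitter only sees the upper
layer's leg in bi_leg; the lower leg is visible only to the layer that saved
and restored the field.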

I found this is still very challenging while writing the code.
Saving/restoring bi_leg may not be enough, because the drivers don't know how to verify fs metadata.

E.g. two-layer raid1 stacking:

fs:                        md0(copies:2)
                          /             \
layer1/raid1     md1(copies:2)       md2(copies:2)
                  /       \           /       \
layer2/raid1   dev0       dev1     dev2       dev3

Assume dev2 is corrupted:
=> md2: doesn't know how to verify fs metadata.
=> md0: fs verification fails, retry md1 (preserve md2).
Then md2 will never be retried, even though dev3 may also have the right copy.
Unless the upper-layer device (md0) can know that the number of copies is 4 instead of 2?
It would also need a way to handle the mapping.
Did I miss something? Thanks!
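
A rough userspace sketch of the concern, not kernel code. All names are
hypothetical, and it assumes a lower raid1 always serves its leg 0 unless
told which leg to use. To make the missed copy observable it pushes the
example one step further and lets the caller mark any leaf as bad (e.g.
dev0, dev1 and dev2 bad, dev3 good).

```c
#include <stdbool.h>

#define LEGS 2

/*
 * Hypothetical model of a layer-2 raid1 (md1 or md2): good[i] is true
 * when leaf i holds a copy that would pass fs-metadata verification.
 */
struct mirror {
	bool good[LEGS];
};

/*
 * The lower mirror cannot verify fs metadata, so in this model it
 * serves leg 0 unless the caller names a leg explicitly (leg >= 0).
 */
static bool mirror_read(const struct mirror *m, int leg)
{
	return m->good[leg < 0 ? 0 : leg];
}

/*
 * md0 believes it has 2 copies: one attempt per child, with the leaf
 * choice left to the child. Returns true if a good copy was found.
 */
static bool md0_read_two_copies(const struct mirror child[LEGS])
{
	for (int i = 0; i < LEGS; i++)
		if (mirror_read(&child[i], -1))
			return true;
	return false;
}

/*
 * md0 knows there are LEGS * LEGS copies and maps a flat copy index
 * to a (child, leaf) pair, so every leaf gets tried.
 */
static bool md0_read_four_copies(const struct mirror child[LEGS])
{
	for (int copy = 0; copy < LEGS * LEGS; copy++)
		if (mirror_read(&child[copy / LEGS], copy % LEGS))
			return true;
	return false;
}
```

With only dev3 good, md0_read_two_copies() gives up after dev0 and dev2,
while md0_read_four_copies() reaches dev3. The copy / LEGS and copy % LEGS
split is just one possible form of the mapping mentioned above.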


>> - is it generic/abstract enough to be able to work with
>> RAID5/6 to trigger verification/recovery from the parity
>> information in the stripe?
> If we get a non -1 bi_leg for parity raid, this is an indicator
> that parity rebuild needs to happen. For multi-parity setups we could
> also use different levels there.