More RAID / SATA / barrier problems [ Re: 2.6.17-mm5 ]

From: Neil Brown
Date: Sat Jul 01 2006 - 18:27:41 EST


On Saturday July 1, akpm@xxxxxxxx wrote:
>
> mddev->pers is NULL in md_error(), so the test of
> !mddev->pers->error_handler oopsed. Perhaps this is a real MD bug which is
> now being exposed by the new barrier-handling problem.
>

Yes, this is a real MD bug which would hit whenever writing a
superblock fails during array-shutdown. I guess that has never
happened before! The work around you propose is probably as good as
any, but I'll think through it some more and see.

It seems that super block writes are always failing in some
configurations at the moment!

I wonder what we *should* do when writing to the superblock on the
last device of a raid1 faills... maybe switch the array to read-only?
I'll have a think about that too.

But we need to find out why barrier-writes are returning EIO.
Hopefully Reuben's testing will shed some light.

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/