Re: raid is dangerous but that's secret (was Re: [patch] ext2/3: document conditions when reliable operation is possible)

From: NeilBrown
Date: Fri Aug 28 2009 - 03:34:30 EST


On Fri, August 28, 2009 4:44 pm, Pavel Machek wrote:
> On Thu 2009-08-27 21:32:49, Ric Wheeler wrote:
>>>
>> If you have a specific bug in MD code, please propose a patch.
>
> Interesting. So, what's technically wrong with the patch below?
>

You mean apart from ".... that high highly undesirable ...." ??
                               ^^^^^^^^^^^

And the phrase "Regular backups when using these devices ...." should
be "Regular backups when using any devices .....".
                               ^^^
If you have a device failure near a power fail on a raid5 you might
lose some blocks of data. If you have a device failure near (or not
near) a power failure on raid0 or jbod etc you will certainly lose lots
of blocks of data.
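
A toy sketch of how the raid5 case can arise, assuming plain XOR parity
over a stripe of two data blocks and one parity block. This is not MD
code; the names (xor_blocks, d0, d1 and so on) are made up purely for
illustration.

```python
# Toy model of a RAID 5 stripe: two data blocks plus one XOR parity block.
# Not MD code; every name here is hypothetical and for illustration only.

def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

# A consistent stripe: parity = d0 XOR d1.
d0 = b"AAAA"
d1 = b"BBBB"
parity = xor_blocks(d0, d1)

# With consistent parity, a lost d1 can be rebuilt from d0 and parity.
assert xor_blocks(d0, parity) == d1

# Now d0 is being rewritten. Power fails after the new data reaches the
# disk but before the matching parity update does, so parity is stale.
d0_on_disk = b"CCCC"
stale_parity = parity

# The disk holding d1 then fails and the array is degraded. d1 has to be
# rebuilt from the surviving blocks, and the stale parity corrupts it.
rebuilt_d1 = xor_blocks(d0_on_disk, stale_parity)

print(rebuilt_d1 == d1)  # False: d1 was not being written, yet it is lost
```

Without the device failure, the stale parity could simply be recomputed
from the data blocks, which is why the array has to be degraded for the
problem to show up.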

I think it would be better to say:

".... and degraded DM/MD RAID 4/5/6(*) arrays..."
          ^^^^^^^^
with
(*) If device failure causes the array to become degraded during or
immediately after the power failure, the same problem can result.

And "necessary" only has the one 'c' :-)

NeilBrown

> Pavel
> ---
>
> From: Theodore Tso <tytso@xxxxxxx>
>
> Document that many devices are too broken for filesystems to protect
> data in case of powerfail.
>
> Signed-off-by: Pavel Machek <pavel@xxxxxx>
>
> diff --git a/Documentation/filesystems/dangers.txt b/Documentation/filesystems/dangers.txt
> new file mode 100644
> index 0000000..2f3eec1
> --- /dev/null
> +++ b/Documentation/filesystems/dangers.txt
> @@ -0,0 +1,21 @@
> +There are storage devices that high highly undesirable properties when
> +they are disconnected or suffer power failures while writes are in
> +progress; such devices include flash devices and DM/MD RAID 4/5/6 (*)
> +arrays. These devices have the property of potentially corrupting
> +blocks being written at the time of the power failure, and worse yet,
> +amplifying the region where blocks are corrupted such that additional
> +sectors are also damaged during the power failure.
> +
> +Users who use such storage devices are well advised take
> +countermeasures, such as the use of Uninterruptible Power Supplies,
> +and making sure the flash device is not hot-unplugged while the device
> +is being used. Regular backups when using these devices is also a
> +Very Good Idea.
> +
> +Otherwise, file systems placed on these devices can suffer silent data
> +and file system corruption. An forced use of fsck may detect metadata
> +corruption resulting in file system corruption, but will not suffice
> +to detect data corruption.
> +
> +(*) Degraded array or single disk failure "near" the powerfail is
> +neccessary for this property of RAID arrays to bite.
>
>