Re: raid is dangerous but that's secret (was Re: [patch] ext2/3:document conditions when reliable operation is possible)

From: Ric Wheeler
Date: Mon Aug 31 2009 - 09:22:31 EST


On 08/31/2009 09:16 AM, Christoph Hellwig wrote:
On Mon, Aug 31, 2009 at 09:15:27AM -0400, Ric Wheeler wrote:
While most common filesystems do have barrier support, it is:

- not actually enabled for the two most common filesystems
- the support for write barriers and cache flushing tends to be buggy
all over our software stack,


Or just missing - I think that MD5/6 simply drop the requests at present.

I wonder if it would be worth having MD probe for write cache enabled and
warn if barriers are not supported?

In my opinion even that is too weak. We know how to control the cache
settings on all common disks (that is scsi and ata), so we should always
disable the write cache unless we know that the whole stack (filesystem,
raid, volume managers) supports barriers. And even then we should make
sure the filesystems do actually use barriers everywhere that's needed,
which we failed at for years.


I was thinking about that as well. Having us disable the write cache when we know barriers are not supported (like in the MD5 case) would certainly be *much* safer for almost everyone.

We would need to have a way to override the write cache disabling for people who know that they have a non-volatile write cache (unlikely as it would probably be to put MD5 on top of a hardware RAID/external array, but some of the new SSDs claim to have non-volatile write cache).
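
To make the proposed policy concrete, here is a rough sketch of the decision, not kernel code. The names Layer, supports_barriers, cache_is_nonvolatile and admin_override are all hypothetical and exist only for illustration; how each layer would actually report barrier support is left open.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """One layer of the storage stack (filesystem, RAID, volume manager)."""
    name: str                # e.g. "ext3", "md-raid5", "lvm"
    supports_barriers: bool  # hypothetical flag, not a real kernel field


def should_disable_write_cache(layers, cache_is_nonvolatile=False,
                               admin_override=False):
    """Return (disable, reason) for the drive's volatile write cache.

    Assumed policy: disable the cache unless every layer passes
    barriers down, with an escape hatch for non-volatile caches and
    for an explicit administrator override.
    """
    if admin_override:
        return False, "administrator override: write cache left enabled"
    if cache_is_nonvolatile:
        return False, "write cache is non-volatile"
    missing = [layer.name for layer in layers if not layer.supports_barriers]
    if missing:
        return True, "no barrier support in: " + ", ".join(missing)
    return False, "all layers support barriers"


if __name__ == "__main__":
    stack = [Layer("ext3", True), Layer("md-raid5", False)]
    print(should_disable_write_cache(stack))
```

A single layer that drops barriers is enough to force the cache off, which matches the "whole stack" condition above; the override is checked first so that it always wins.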

It would also be very useful to have all of our top-tier file systems enable barriers by default, provide consistent barrier on/off mount options and log a nice warning when barriers are not enabled...
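
As an illustration of what "consistent" could look like, the sketch below folds the option spellings I believe are in use today (barrier, nobarrier, barrier=0, barrier=1) into one on/off answer and produces a warning string when barriers end up off. The function names and the warning text are made up for this example, and the per-filesystem default is passed in by the caller rather than assumed.

```python
def barriers_enabled(mount_options, default):
    """Resolve a comma-separated mount option string to barriers on/off.

    Later options override earlier ones; `default` is the filesystem's
    own default and must be supplied by the caller.
    """
    enabled = default
    for opt in mount_options.split(","):
        opt = opt.strip()
        if opt in ("barrier", "barrier=1"):
            enabled = True
        elif opt in ("nobarrier", "barrier=0"):
            enabled = False
    return enabled


def barrier_warning(fs_name, mount_options, default):
    """Return a warning string if barriers are off, otherwise None."""
    if barriers_enabled(mount_options, default):
        return None
    return (fs_name + ": write barriers are disabled; data may be lost "
            "on power failure if the drive write cache is volatile")


if __name__ == "__main__":
    msg = barrier_warning("ext3", "rw,noatime", default=False)
    if msg is not None:
        print(msg)
```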

ric
