Re: [sqlite] light weight write barriers

From: Vladislav Bolkhovitin
Date: Mon Nov 12 2012 - 22:41:13 EST



Alan Cox, on 11/02/2012 08:33 AM wrote:
>> b) most drives will internally re-order requests anyway
>
> They will but only as permitted by the commands queued, so you have some
> control depending upon the interface capabilities.
>
>> c) cheap drives won't support barriers
>
> Barriers are pretty much universal as you need them for power off !

I'm afraid no storage (drives, if you prefer that term) supports barriers at the moment and, as far as I know the history of storage, none ever has.

Instead, what storage does support in this area is:

1. Cache flushing facilities: FUA, SYNCHRONIZE CACHE, etc.

2. Command ordering facilities: command attributes (ORDERED, SIMPLE, etc.), ACA, etc.

3. Atomic commands, e.g. scattered writes, which allow data to be written to several separate, non-adjacent blocks atomically, i.e. they guarantee that either all blocks are written or none at all. This is relatively new functionality, natural for flash storage with its COW internals.

Obviously, with such atomic write commands an application or a file system no longer needs any journaling. FusionIO reported a 50% performance increase after they modified MySQL to use them.
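
To illustrate why an all-or-nothing scattered write removes the need for a journal, here is a minimal Python sketch of a toy copy-on-write device. Everything in it (ToyCowDevice, plain_writes, atomic_scattered_write, the crash_after parameter) is a hypothetical model made up for this illustration, not a real storage command set or any vendor's API.

```python
class CrashError(Exception):
    """Simulated power loss in the middle of an update."""


class ToyCowDevice:
    """Toy copy-on-write block device (hypothetical model, not a real API)."""

    def __init__(self, nblocks, block_size=4):
        self.block_size = block_size
        # Live mapping: logical block number -> block contents (zero-filled).
        self._live = {lbn: bytes(block_size) for lbn in range(nblocks)}

    def read(self, lbn):
        return self._live[lbn]

    def plain_writes(self, updates, crash_after=None):
        """Overwrite blocks in place, one by one; a crash leaves a torn update."""
        for done, (lbn, data) in enumerate(updates.items()):
            if done == crash_after:
                raise CrashError
            self._live[lbn] = data

    def atomic_scattered_write(self, updates, crash_after=None):
        """Write non-adjacent blocks all-or-nothing through a shadow mapping."""
        shadow = dict(self._live)      # old blocks stay untouched (COW)
        for done, (lbn, data) in enumerate(updates.items()):
            if done == crash_after:
                raise CrashError       # shadow is dropped, live map unchanged
            shadow[lbn] = data
        self._live = shadow            # one switch-over commits every block


def written_blocks(dev, updates):
    """Return the block numbers whose new contents are visible on the device."""
    return [lbn for lbn, data in updates.items() if dev.read(lbn) == data]


if __name__ == "__main__":
    updates = {1: b"AAAA", 5: b"BBBB", 9: b"CCCC"}
    for method in ("plain_writes", "atomic_scattered_write"):
        dev = ToyCowDevice(16)
        try:
            getattr(dev, method)(updates, crash_after=2)
        except CrashError:
            pass
        print(method, "-> visible after crash:", written_blocks(dev, updates))
```

With a simulated crash after two of the three blocks, the in-place path leaves blocks 1 and 5 updated and block 9 stale, which is exactly the torn state a journal exists to repair. The atomic path leaves none of the three visible, so there is nothing to replay.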


Note that those 3 facilities are ORTHOGONAL, i.e. they can be used independently, including on the same request. That is the root cause of why the barrier concept is so evil. If you specify a barrier, how can you say what kind of action you actually want from the storage: a cache flush? An ordered write? Or both?
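
The orthogonality can be made concrete with a small Python sketch that models each facility as an independent per-request flag. The Req flag set, the actions() helper and the BARRIER_MEANINGS tuple are assumptions invented for this illustration; they do not correspond to actual kernel flags or SCSI field names.

```python
from enum import Flag


class Req(Flag):
    """Independent per-request attributes (hypothetical names)."""
    FLUSH = 1    # flush the volatile write cache first (cf. SYNCHRONIZE CACHE)
    FUA = 2      # this request's data must be on stable media at completion
    ORDERED = 4  # do not reorder this request relative to its neighbours
    ATOMIC = 8   # all blocks of this request are written, or none


def actions(flags):
    """List the distinct things the storage is asked to do for one request."""
    return [f.name for f in Req if f in flags]


# A bare "barrier" does not say which of these the caller actually wants.
BARRIER_MEANINGS = (
    Req.FLUSH,                # only a cache flush
    Req.ORDERED,              # only an ordering constraint
    Req.FLUSH | Req.ORDERED,  # both
)

if __name__ == "__main__":
    # Explicit flags carry the intent; any combination is expressible.
    print(actions(Req.FUA | Req.ORDERED))
    for meaning in BARRIER_MEANINGS:
        print("barrier could mean:", actions(meaning))
```

With explicit flags the caller states exactly which guarantees it needs, and the storage never pays for a cache flush when only ordering was wanted, or the reverse.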

This is why the relatively recent removal of barriers from the Linux kernel (http://lwn.net/Articles/400541/) was a big step forward. The next logical step should be to allow the ORDERED attribute on requests to be accelerated by the storage's ORDERED commands, if it supports them. If not, fall back to the existing queue draining.
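
A rough Python sketch of that proposal follows. It only lists the steps a block layer might take for a stream of requests; the dispatch() function, its arguments and the step strings are hypothetical and do not reflect how the Linux block layer is actually structured.

```python
def dispatch(requests, device_supports_ordered):
    """Return the steps taken for a stream of (name, is_ordered) requests."""
    steps = []
    in_flight = []  # requests submitted but not yet waited for
    for name, is_ordered in requests:
        if not is_ordered:
            steps.append(f"submit {name} SIMPLE")
            in_flight.append(name)
        elif device_supports_ordered:
            # Accelerated path: the device enforces the ordering itself,
            # so the queue keeps flowing.
            steps.append(f"submit {name} ORDERED")
            in_flight.append(name)
        else:
            # Fallback: drain the queue around the ordered request.
            if in_flight:
                steps.append("wait for " + ", ".join(in_flight))
                in_flight.clear()
            steps.append(f"submit {name} SIMPLE")
            steps.append(f"wait for {name}")
    return steps


if __name__ == "__main__":
    stream = [("w1", False), ("w2", False), ("commit", True), ("w3", False)]
    for supported in (True, False):
        print("ORDERED supported:", supported)
        for step in dispatch(stream, supported):
            print("   ", step)
```

In the fallback path the host has to stop and wait twice around the ordered request, while the accelerated path never stalls the queue. That difference is where the gain would come from.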

Actually, I wonder why the barrier concept is so sticky in the Linux world. A simple Google search shows that only Linux uses this concept for storage. And 2 years have passed since barriers were removed from the kernel, but people still discuss them as if they were still there.

Vlad