Re: Linux 2.6.29

From: Ric Wheeler
Date: Wed Mar 25 2009 - 17:01:17 EST


Linus Torvalds wrote:
On Wed, 25 Mar 2009, Jeff Garzik wrote:
It is clearly possible to implement an fsync(2) that causes FLUSH CACHE to be
issued, without adding full barrier support to a filesystem. It is likely
doable to avoid touching per-filesystem code at all, if we issue the flush
from a generic fsync(2) code path in the kernel.

We could easily do that. It would even work for most cases. The problematic ones are where filesystems do their own disk management, but I guess those people can do their own fsync() management too.

One concern with doing this above the file system is that you are not in the context of a transaction, so you have no clean promises about what is on disk and persistent, or when. Flushing the cache is primitive at best, but the way barriers work today is designed to give transactions some pretty critical ordering semantics, at least for journalling file systems.

I don't see how you could use this approach to make a really robust, failure-proof storage system, but it might appear to work most of the time for most people :-)

ric

Somebody send me the patch, we can try it out.

Remember, fsync(2) means that the user _expects_ a performance hit.

Within reason, though.

OS X, for example, doesn't do the disk barrier. It requires you to do a separate FULL_FSYNC (or something similar) ioctl to get that. Apparently exactly because users don't expect quite _that_ big of a performance hit.

(Or maybe just because it was easier to do that way. Never attribute to malice what can be sufficiently explained by stupidity).

Linus
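
For readers following along, the OS X behaviour mentioned above can be sketched from user space. On OS X the request is spelled F_FULLFSYNC and is issued through fcntl(2) rather than an ioctl. The helper names durable_sync() and write_durably() below are hypothetical, made up for this illustration. This is a minimal sketch, assuming that a plain fsync(2) is an acceptable fallback wherever F_FULLFSYNC is not defined; whether that fallback actually reaches the drive's write cache on Linux depends on the file system and its barrier settings, which is the open question in this thread.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * durable_sync() is a hypothetical helper, not an existing API.
 * Where F_FULLFSYNC is defined (OS X), ask the drive to flush its
 * write cache too; otherwise, or if that request fails, fall back
 * to a plain fsync().
 */
static int durable_sync(int fd)
{
#ifdef F_FULLFSYNC
    if (fcntl(fd, F_FULLFSYNC) == 0)
        return 0;
    /* Not every file system supports F_FULLFSYNC; try fsync() instead. */
#endif
    return fsync(fd);
}

/*
 * write_durably() is also hypothetical: write the whole buffer,
 * retrying on short writes and EINTR, then call durable_sync().
 * Returns 0 on success, -1 on error with errno set.
 */
static int write_durably(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return durable_sync(fd);
}
```

A caller that only needs the data handed to the kernel can stop at write(); the extra call is where the performance hit discussed above is paid, so it is worth making only at points where the application really depends on the data surviving a power loss.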


