Jeff Garzik wrote:
> Nick Piggin wrote:
>> Anyway, the idea of making fsync/fdatasync etc. safe by default is
>> a good idea IMO, and is a bad bug that we don't do that :(
>
> Agreed... it's also disappointing that [unless I'm mistaken] you have
> to hack each filesystem to support barriers.
>
> It seems far easier to make sync_blkdev() Do The Right Thing, and
> magically make all filesystems data-safe.

Well, you need ordered metadata writes, barriers _and_ flushes with
some filesystems.
Merely writing all the data pages then issuing a drive cache flush
won't Do The Right Thing with those filesystems - someone already
mentioned Btrfs, where it won't.
But I agree that your suggestion would make a superb default, for
filesystems which don't provide their own function.
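
To make the "default, unless the filesystem provides its own function" idea concrete, here is a minimal userspace sketch of the dispatch. Everything in it is hypothetical: struct toy_fs, generic_sync_file(), ordered_sync_file() and toy_fsync() are illustrative names, not kernel interfaces, and the counters only stand in for real writeback and flush requests.

```c
#include <stddef.h>

/* Hypothetical toy model; these are NOT the kernel's VFS structures. */
struct toy_fs {
    const char *name;
    /* Filesystem-specific sync; NULL means "use the generic default". */
    int (*sync_file)(struct toy_fs *fs);
    int pages_written;  /* times the generic data-page writeback ran */
    int cache_flushes;  /* drive cache flush requests issued */
    int custom_syncs;   /* times a filesystem-specific sync ran */
};

/* Generic default: write back the data pages, then flush the drive cache. */
static int generic_sync_file(struct toy_fs *fs)
{
    fs->pages_written++;
    fs->cache_flushes++;
    return 0;
}

/*
 * Stand-in for a filesystem that needs its own ordering rules
 * (ordered metadata writes, barriers and flushes). The body is a
 * placeholder and does not model any real filesystem.
 */
static int ordered_sync_file(struct toy_fs *fs)
{
    fs->custom_syncs++;
    return 0;
}

/* Dispatch: prefer the filesystem's own function when it provides one. */
static int toy_fsync(struct toy_fs *fs)
{
    if (fs->sync_file)
        return fs->sync_file(fs);
    return generic_sync_file(fs);
}
```

A filesystem that leaves sync_file as NULL gets the generic path; one that sets it (to ordered_sync_file here) bypasses the generic path entirely.
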
It's not optimal even then.
Devices: On a software RAID, you ideally don't want to issue flushes
to all drives if your database did a 1-block commit entry. (But
databases probably use O_DIRECT anyway, changing the rules again.) But
all that can be optimised in generic VFS code eventually. It doesn't
need filesystem assistance in most cases.
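
One way to picture the software RAID optimisation is per-member dirty tracking: remember which member devices received writes since the last flush, and flush only those. The sketch below is a toy userspace model under that assumption. struct toy_raid, toy_raid_write() and toy_raid_flush() are hypothetical names, not md or block-layer APIs, and the model assumes at most 32 members.

```c
#include <stdint.h>

#define TOY_MAX_MEMBERS 32

/* Hypothetical toy model of a software RAID array. */
struct toy_raid {
    unsigned nmembers;                  /* must be <= TOY_MAX_MEMBERS */
    uint32_t dirty;                     /* bit i set: member i has unflushed writes */
    unsigned flushes[TOY_MAX_MEMBERS];  /* cache flushes issued per member */
};

/* Record that a write landed on one member device. */
static void toy_raid_write(struct toy_raid *r, unsigned member)
{
    if (member < r->nmembers)
        r->dirty |= UINT32_C(1) << member;
}

/* Flush only the dirty members; return how many flushes were issued. */
static unsigned toy_raid_flush(struct toy_raid *r)
{
    unsigned issued = 0;

    for (unsigned i = 0; i < r->nmembers; i++) {
        if (r->dirty & (UINT32_C(1) << i)) {
            r->flushes[i]++;
            issued++;
        }
    }
    r->dirty = 0;
    return issued;
}
```

With four members and a single-block write to member 2, toy_raid_flush() issues one flush instead of four. On a mirrored array the same write would dirty every mirror, so the saving applies mainly to striped layouts.
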