Re: [rfc] Ignore Fsync Calls in Laptop_Mode

From: david
Date: Sun May 29 2011 - 21:54:01 EST


On Sun, 29 May 2011, D. Jansen wrote:

On Fri, May 27, 2011 at 4:17 PM, Theodore Tso <tytso@xxxxxxx> wrote:
On May 27, 2011, at 3:12 AM, D. Jansen wrote:
That reordering is exactly what I'm talking about. It wasn't my idea.
But if I understood it correctly, it's possible that the kernel
commits writes of an application, _to one and the same file_, in a
non-FIFO order, if the application does not fsync. And this _afaiu_
could result in the loss not only of new data, but complete corruption
of previously existing data in laptop mode without fsync.

No, you're not understanding the problem.  All layers of the storage
stack -- including the hard drive -- are allowed to reorder writes.  So
even if the kernel sends data to the disk in the exact same order that
the application wrote it, it could still get written in a different order,
because the hard drive itself can reorder writes.  This is necessary
for performance; if you didn't have this, the storage stack would be
dog slow, and would consume even more power.

So at every level, the only thing you can count upon is that if you want
to make sure everything is flushed to stable store, you need to send
an fsync() command at the application to file system level, or a barrier
or flush command at the OS to hard drive level.
(...)
Ordering doesn't matter, because nothing, including the hard drive,
guarantees ordering.  What does matter is that the fsync() commands
act like barriers; writes before the fsync() command are guaranteed
to be written to the disk, and survive a reboot, before any writes after
the fsync() are processed.  See?
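
To make the barrier behaviour concrete, here is a minimal sketch in C of an application that relies on it. The names write_all() and write_in_order() are made up for illustration and are not from any existing program; the only real interfaces used are the POSIX calls open(), write(), fsync() and close().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write the whole buffer, retrying on short writes. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing, retry */
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Hypothetical example: `first` must be on stable storage before any
 * byte of `second` is even submitted. The fsync() between the two
 * writes is the only thing that gives us that ordering.
 * Returns 0 on success, -1 on any error.
 */
int write_in_order(const char *path, const char *first, const char *second)
{
    assert(path != NULL && first != NULL && second != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    int rc = -1;
    if (write_all(fd, first, strlen(first)) == 0 &&
        fsync(fd) == 0 &&                  /* barrier: `first` is flushed */
        write_all(fd, second, strlen(second)) == 0 &&
        fsync(fd) == 0)                    /* now `second` is flushed too */
        rc = 0;

    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Without the first fsync() the two writes could reach the platter in either order, which is the situation described above.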

Ok, thanks a lot! I understand a lot better now!
So we can't live without the fsyncs.

So what if we would queue the fsyncs along with the writes - we would
just fsync later instead of immediately, in between the writes as they
came in. Then by design previous data could not be corrupted, right?
We would do exactly the same thing, just later.
It'd be kind of a disk write time distortion field.

the problem is that the spec for fsync says that your program stops until fsync finishes. If you don't do that then you will corrupt and lose data.

so if you delay fsync you will have your application (or desktop manager) freeze until the fsync completes.
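
a rough way to see the blocking for yourself is to time the call. this is only a sketch: timed_sync_write() is a made-up name, and the number you get depends entirely on the disk, the filesystem and what else is dirty at the time. the calls themselves (open, write, fsync, clock_gettime) are plain POSIX.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical example: write `data` to `path`, then measure how long
 * the caller sits inside fsync(). The program cannot continue until
 * fsync() returns, so any delay added to fsync() shows up here.
 * Returns 0 on success, -1 on error; *elapsed_sec gets the fsync() time.
 */
int timed_sync_write(const char *path, const char *data, double *elapsed_sec)
{
    struct timespec start, end;

    assert(path != NULL && data != NULL && elapsed_sec != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    size_t len = strlen(data);
    int rc = -1;

    /* A single write() is enough for a short buffer in this sketch. */
    if (write(fd, data, len) == (ssize_t)len &&
        clock_gettime(CLOCK_MONOTONIC, &start) == 0) {
        int sync_rc = fsync(fd);        /* the caller is blocked here */
        if (clock_gettime(CLOCK_MONOTONIC, &end) == 0 && sync_rc == 0) {
            *elapsed_sec = (double)(end.tv_sec - start.tv_sec) +
                           (double)(end.tv_nsec - start.tv_nsec) / 1e9;
            rc = 0;
        }
    }

    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

if the kernel held the fsync back until the disk spun up, that elapsed time is how long the application would appear frozen.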

if what you are wanting is the ability to say 'these things must be written before these other things to keep them from being corrupted, but I don't care when they get written (or if they get lost in a crash)' then what you want isn't fsync, it's a barrier.

David Lang