Re: How to know when file data has been flushed into disk?

From: Zach Brown
Date: Fri Apr 07 2006 - 13:55:36 EST



> If a program access data like this:
>
> 1. open the file
> 2. write a lot of data into this file

You don't say if this is an extending write or overwriting existing file
data. I'm going to assume extending writes so that data=ordered kicks in.

> 3. close the file

> So my questions are:
> 1. How will the file system be notified after all data has been
> flushed into disk?

Look at phase 2 in journal_commit_transaction(). The kjournald thread
issues the writeback of the file data by walking t_sync_datalist and
then waits for the writeback to complete by using wait_on_buffer()
before committing the transaction.

> 2. Unlike data=journal mode, in data=order mode, the data could be
> lost if system crashes when data is being flushed to disk. When system
> reboots, does journal contains the old meta data for undo?

No, ext3 isn't roll-backward. It doesn't store the *old* data in the
journal and undo the change if it fails halfway through. It's
roll-forward. It stores the *new* data in the journal and replays
complete transactions in the journal that weren't moved out to their
final place on disk at the time of the crash.

So if the machine reboots during the writeback phase then the
transaction won't be committed yet and recovery won't replay that
transaction from the journal. From the metadata's point of view the
file extension will never have happened.

> 3. Does sys_close() have to be blocked until all data and metadata
> are committed?

No, and neither does sys_getpid() :)

> to take subsequent operation. However, data flush could be failed. In
> this case, file system seems to mislead the application. Is this true?

No. The application has no grounds for assuming that a successful
close() has synced previous operations to disk. It's simply not part of
the API.

> If so, any solutions?

The application should rely on tools like fsync(), fdatasync(), O_SYNC,
mount -o sync, etc. Whatever suits it best.

- z
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/