Crash safety semantics of mmap(), rename(), fsync(), msync()
From: Alex Bligh
Date: Sat Jul 23 2011 - 07:06:36 EST
I have a question about the safety of mmap(), rename() and fsync()/msync().
I am creating a memory mapped file with a particular name, and I want to
ensure the following in the event of a system crash / power outage /
whatever:
A. If the file exists on disk with its final name at any stage, it MUST
contain all the data I have written to it (i.e. it must never exist on disk
with different data and this name)
B. At the exit from the function concerned, the file must exist with its
final name, and must contain that data on disk. The fd must still be open
and the mmap in place (it's used to read from elsewhere).
What I am doing to achieve this is:
1. open (O_CREAT) to a temporary file name
2. ftruncate() the file to required length
3. mmap() the file
4. write the data to the mmap'd area
5. msync() the whole area to ensure the data is written
6. fsync() the file to ensure the metadata is written (e.g. the creation of
the file in the first place and the extension of the file by ftruncate())
7. rename() the file to the required file name
8. fsync() the file again, to ensure the rename is written to disk (to
satisfy criterion B above)
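
For concreteness, the eight steps above can be sketched in C roughly as
follows. This is only an illustration of the sequence as described, not a
claim that it is sufficient or minimal; whether each sync is needed is
exactly the open question. The function name publish_mapped(), its
parameters and the 0644 mode are hypothetical, chosen for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes (len > 0) to tmp_path through a
 * shared mapping, then rename it to final_path.
 * Returns the mapping and stores the still-open fd in *fd_out on success;
 * returns MAP_FAILED on any error.
 */
void *publish_mapped(const char *tmp_path, const char *final_path,
                     const void *data, size_t len, int *fd_out)
{
    void *map = MAP_FAILED;

    /* Step 1: create the file under a temporary name. */
    int fd = open(tmp_path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return MAP_FAILED;

    /* Step 2: extend the file to the required length. */
    if (ftruncate(fd, (off_t)len) != 0)
        goto fail;

    /* Step 3: map it shared, so stores reach the file. */
    map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto fail;

    /* Step 4: write the data into the mapped area. */
    memcpy(map, data, len);

    /* Step 5: synchronously flush the mapped pages. */
    if (msync(map, len, MS_SYNC) != 0)
        goto fail;

    /* Step 6: fsync() the file (intended to cover creation and length). */
    if (fsync(fd) != 0)
        goto fail;

    /* Step 7: give the file its final name. */
    if (rename(tmp_path, final_path) != 0)
        goto fail;

    /* Step 8: second fsync(), intended to cover the rename. */
    if (fsync(fd) != 0)
        goto fail;

    *fd_out = fd;   /* fd stays open and the mapping stays in place */
    return map;

fail:
    if (map != MAP_FAILED)
        munmap(map, len);
    close(fd);
    unlink(tmp_path);   /* harmless no-op if the rename already happened */
    return MAP_FAILED;
}
```

A caller would pass the temporary and final names plus the buffer, keep the
returned pointer for later reads, and eventually munmap() and close() it.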
This is all quite time critical. I can pretty much choose the file system
but would prefer something like ext4. The file is between 1MB and 8MB in
size if that matters any.
The question I have is this: Is it really necessary to msync() and fsync()
twice? Can I get away without (e.g.) the stage 6 fsync()? Or, without it,
might a crash immediately after the rename() result in a file that has the
permanent name, but the wrong metadata? (I only care about file name and
recorded length I think). Or is rename() guaranteed to write out all
metadata (e.g. file length), in which case can I drop both fsync()s? Or are
the fsync()s guaranteed to be cheap if they do nothing?
--
Alex Bligh