Re: Strange write starvation on 2.6.17 (and other) kernels

From: Andrew Morton
Date: Tue Aug 15 2006 - 10:02:41 EST

On Tue, 15 Aug 2006 14:40:48 +0200 (CEST)
Grzegorz Kulewski <kangur@xxxxxxxxxx> wrote:

> On Tue, 15 Aug 2006, Andrew Morton wrote:
> > Which filesystem?
> >
> > If ext3 in ordered-data mode: any fsync() will sync the whole filesystem
> > (it has to).
> Could you explain some more why it has to?... Is this caused by design of
> ext3 or any filesystem in ordered-data mode has to do it?

The journal is a shared resource - a single swipe of disk which is written
(logically) atomically and which contains the metadata modifications for
many files. As filesystem activity proceeds we attach more and more
metadata blocks to the journal and when a commit happens we write them all
out in one hit.

Hence an fsync() of a single file ends up journalling the metadata for all
files which have pending metadata writes.

And in data=ordered mode the filesystem must write all the user-data for a
file before it writes the metadata which refers to that data.

IOW, a single fsync() triggers the journalling of all file metadata which
requires the writing of all file data.

In data=writeback mode the fs doesn't have the requirement that file
user-data be written prior to the journalling of the metadata which refers
to that data, so we can leave the pagecache for all the non-fsynced files
floating about in memory, still dirty.
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at