Re: [RFC PATCH] fpathconf() for fsync() behavior
From: Andrew Morton
Date: Thu Apr 23 2009 - 01:23:00 EST
On Wed, 22 Apr 2009 20:12:57 -0400 Valerie Aurora Henson <vaurora@xxxxxxxxxx> wrote:
> In the default mode for ext3 and btrfs, fsync() is both slow and
> unnecessary for some important application use cases - at the same
> time that it is absolutely required for correctness for other modes of
> ext3, ext4, XFS, etc. If applications could easily distinguish
> between the two cases, they would be more likely to be correct and
> fast.
>
> How about an fpathconf() variable, something like _PC_ORDERED? E.g.:
>
> /* Unoptimized example: fsync() only when the filesystem needs it */
> write(fd, buf, len);
> /* Skip fsync() if the filesystem already orders data before rename */
> if (fpathconf(fd, _PC_ORDERED) != 1)
>         fsync(fd);
> rename(tmp_path, new_path);
>
> I know of two specific real-world cases in which this would
> significantly improve performance: (a) fsync() before rename(), (b)
> fsync() of the parent directory of a newly created file. Case (b) is
> particularly nasty when you have multiple threads creating files in
> the same directory because the dir's i_mutex is held across fsync() -
> file creates become limited to the speed of sequential fsync()s.
>
> Conceptual libc patch below.
Would it be better to implement new syscall(s) with finer-grained control
and better semantics? Then userspace would just need to do:
fsync_on_steroids(fd, FSYNC_BEFORE_RENAME);
and that all gets down into the filesystem which can then work out what
it needs to do to implement the command.
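
To make the idea concrete, here is a rough userspace sketch of how a
caller might use such an interface in the write-then-rename case. Nothing
here exists today: fsync_on_steroids(), the FSYNC_* flags, and the
replace_file() helper are hypothetical names for illustration only, and
the stand-in simply falls back to plain fsync(), which is always safe but
never faster. A real implementation would pass the hint down to the
filesystem.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical hints; no kernel interface defines these. */
enum fsync_hint {
	FSYNC_BEFORE_RENAME = 1,	/* file data must be durable before rename() */
	FSYNC_DIR_AFTER_CREATE = 2	/* new directory entry must be durable */
};

/*
 * Hypothetical stand-in for the proposed syscall. It ignores the hint
 * and does a full fsync(), which is the conservative fallback.
 */
static int fsync_on_steroids(int fd, enum fsync_hint hint)
{
	(void)hint;
	return fsync(fd);
}

/* Write the whole buffer, retrying on short writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
	while (len > 0) {
		ssize_t n = write(fd, buf, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		buf += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Replace new_path with data: write tmp file, sync, then rename. */
int replace_file(const char *tmp_path, const char *new_path,
		 const char *data, size_t len)
{
	int fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

	if (fd < 0)
		return -1;
	if (write_all(fd, data, len) < 0 ||
	    fsync_on_steroids(fd, FSYNC_BEFORE_RENAME) < 0) {
		close(fd);
		unlink(tmp_path);
		return -1;
	}
	if (close(fd) < 0 || rename(tmp_path, new_path) < 0) {
		unlink(tmp_path);
		return -1;
	}
	return 0;
}
```

The point of the shape is that the application states what it is about
to do and never has to ask what kind of filesystem it is on. A filesystem
that already orders data before rename could treat FSYNC_BEFORE_RENAME as
a no-op.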