Re: adding proper O_SYNC/O_DSYNC, was Re: O_DIRECT and barriers
From: Jamie Lokier
Date: Fri Aug 28 2009 - 12:44:47 EST
Christoph Hellwig wrote:
> On Thu, Aug 27, 2009 at 10:24:28AM -0700, Ulrich Drepper wrote:
> > The problem with O_* extensions is that the syscall doesn't fail if the
> > flag is not handled. This is a problem in the open implementation which
> > can only be fixed with a new syscall.
> >
> > Why can't we just go on and say we interpret O_SYNC like O_SYNC and
> > O_SYNC|O_DSYNC like O_DSYNC. The POSIX spec explicitly requires that
> > the latter be handled like O_SYNC.
> >
> > We could handle it by allocating two bits, only one is handled in the
> > kernel. If the O_DSYNC definition for userlevel would be different from
> > the kernel definition then the kernel could interpret O_SYNC|O_DSYNC
> > like O_DSYNC. The libc would then have to translate the userlevel
> > O_DSYNC into the kernel O_DSYNC. If the libc is too old for the kernel
> > and the application, the userlevel flag would be passed to the kernel
> > and nothing bad happens.
>
> What about the following variant:
>
> - given that our current O_SYNC really is and always has been actually
> POSIX O_DSYNC, keep the numerical value and rename it to O_DSYNC in
> the headers.
> - Add a new O_SYNC definition:
>
> #define O_SYNC (O_DSYNC|O_REALLY_SYNC)
>
> and do full O_SYNC handling in new kernels if O_REALLY_SYNC is
> present.
That looks good for the kernel.
However, for userspace, there's an issue with applications which were
compiled with an old libc and used O_SYNC. Most of them probably
expected O_SYNC behaviour but all they got was O_DSYNC, because Linux
didn't do it right.
But they *didn't know* that.
When using a newer kernel which actually implements O_SYNC behaviour,
I'm thinking those applications which asked for O_SYNC should get it,
even though they're still linked with an old libc.
That's because this thread is the first time I've heard that Linux
O_SYNC was really the weaker O_DSYNC in disguise, and judging from the
many Googlings I've done about O_SYNC in applications and on different
operating systems, it'll be news to other people too.
(I always thought the "#define O_DSYNC O_SYNC" was because Linux
didn't implement the weaker O_DSYNC).
(Oh, and Ulrich: Why is there a "#define O_RSYNC O_SYNC" in the Glibc
headers? That doesn't make sense: O_RSYNC has nothing to do with
writing.)
To achieve that, libc could implement two versions of open() at the
same time as it updates header files. The new libc's __old_open() would
do:
    /* Only O_DSYNC is set for apps built against an old libc. */
    if (flags & O_DSYNC)
        flags |= O_SYNC;
I'm not exactly sure how symbol versioning works, but perhaps the
header file in the new libc would need __REDIRECT_NTH to map open() to
__new_open(), which just calls the kernel. This is to ensure .o and
.a files built with an old libc's headers but then linked to a new
libc will get __old_open().
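
To make the two entry points concrete, here is a minimal sketch of just the flag handling, not of the symbol versioning. The EX_ names, the ex_*() helpers and the bit values are all made up for illustration; EX_O_REALLY_SYNC stands in for whatever the new kernel bit ends up being called.

```c
#include <assert.h>

/* Illustrative bit values only; the real ones depend on the headers. */
#define EX_O_DSYNC       010000u    /* the bit old headers called O_SYNC */
#define EX_O_REALLY_SYNC 04000000u  /* hypothetical new "full sync" bit */
#define EX_O_SYNC        (EX_O_DSYNC | EX_O_REALLY_SYNC)

/*
 * Flag fix-up for the compatibility entry point.  A binary built
 * against old headers can only ever set the EX_O_DSYNC bit, and it
 * meant O_SYNC when it did so, so upgrade it to the full combination.
 */
static unsigned int ex_old_open_flags(unsigned int flags)
{
    if (flags & EX_O_DSYNC)
        flags |= EX_O_SYNC;
    return flags;
}

/* The new entry point passes the flags to the kernel unchanged. */
static unsigned int ex_new_open_flags(unsigned int flags)
{
    return flags;
}

/* Small self-check of the behaviour described above. */
static void ex_check_open_flags(void)
{
    assert(ex_old_open_flags(EX_O_DSYNC) == EX_O_SYNC);
    assert(ex_old_open_flags(0u) == 0u);
    assert(ex_new_open_flags(EX_O_DSYNC) == EX_O_DSYNC);
    assert(ex_new_open_flags(EX_O_SYNC) == EX_O_SYNC);
}
```

With this, only code compiled against the new headers can ask for plain O_DSYNC; everything older is upgraded to O_SYNC.
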
Although libc's __new_open() could have this:
    /* Old kernels only look at O_DSYNC. It's better than nothing. */
    if (flags & O_SYNC)
        flags |= O_DSYNC;
Imho, it's better to not do that, and instead have
    #define O_SYNC (O_DSYNC|__O_SYNC_KERNEL)
as Chris suggests, in the libc header the same as the kernel header,
because that way applications which use the syscall() function or have
to invoke a syscall directly (I've seen clone-using code doing it),
won't spontaneously start losing their O_SYNCness on older kernels.
Unless there is some reason why "flags &= ~O_SYNC" is not permitted to
clear the O_DSYNC flag, or other reason why they must be separate flags.
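
A rough model of what the combined definition buys, and what it does to masking. Everything here is an assumption for illustration: the DEMO_ names, the bit values, and the two demo_*_kernel() functions, which only mimic "old kernels look at the O_DSYNC bit" and "new kernels do full O_SYNC handling if the extra bit is present".

```c
#include <assert.h>

/* Illustrative bit values only. */
#define DEMO_O_DSYNC       010000u
#define DEMO_O_SYNC_KERNEL 04000000u  /* stands in for __O_SYNC_KERNEL */
#define DEMO_O_SYNC        (DEMO_O_DSYNC | DEMO_O_SYNC_KERNEL)

enum demo_sync { DEMO_NO_SYNC, DEMO_DATA_SYNC, DEMO_FULL_SYNC };

/* Model of an old kernel: it only knows the DSYNC bit. */
static enum demo_sync demo_old_kernel(unsigned int flags)
{
    return (flags & DEMO_O_DSYNC) ? DEMO_DATA_SYNC : DEMO_NO_SYNC;
}

/* Model of a new kernel: full sync handling if the extra bit is set. */
static enum demo_sync demo_new_kernel(unsigned int flags)
{
    if (flags & DEMO_O_SYNC_KERNEL)
        return DEMO_FULL_SYNC;
    return (flags & DEMO_O_DSYNC) ? DEMO_DATA_SYNC : DEMO_NO_SYNC;
}

static void demo_check(void)
{
    unsigned int flags = DEMO_O_SYNC;

    /* A direct syscall with the combined value degrades gracefully. */
    assert(demo_old_kernel(flags) == DEMO_DATA_SYNC);
    assert(demo_new_kernel(flags) == DEMO_FULL_SYNC);

    /* The extra bit on its own would be silently ignored by old kernels. */
    assert(demo_old_kernel(DEMO_O_SYNC_KERNEL) == DEMO_NO_SYNC);

    /* Clearing O_SYNC also clears the DSYNC bit. */
    flags &= ~DEMO_O_SYNC;
    assert((flags & DEMO_O_DSYNC) == 0u);

    /* A plain "flags & O_SYNC" test is also true for DSYNC-only flags;
       an exact test has to compare against the full mask. */
    assert((DEMO_O_DSYNC & DEMO_O_SYNC) != 0u);
    assert((DEMO_O_DSYNC & DEMO_O_SYNC) != DEMO_O_SYNC);
}
```

So with the combined value, "flags &= ~O_SYNC" does clear O_DSYNC, and code that wants to distinguish the two has to test "(flags & O_SYNC) == O_SYNC" rather than "flags & O_SYNC".
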
-- Jamie