Re: Writing more than 4096 bytes with O_SYNC flag does not persist all previously written data if system crashes

From: Darrick J. Wong

Date: Tue Feb 24 2026 - 17:24:04 EST


On Tue, Feb 24, 2026 at 06:47:19AM -0800, Christoph Hellwig wrote:
> A lot of folks have already explained the O_SYNC semantics correctly,
> but I have another major question about your test case.
>
> On Wed, Feb 18, 2026 at 04:29:30PM +0300, Vyacheslav Kovalevsky wrote:
> > Detailed description
> > ====================
> >
> > Hello, there seems to be an issue with ext4 crash behavior:
> >
> > 1. Create and sync a new file.
> > 2. Open the file and write some data (must be more than 4096 bytes).
> > 3. Close the file.
> > 4. Open the file with O_SYNC flag and write some data.
> >
> > After system crash the file will have the wrong size and some previously
> > written data will be lost.
>
> The wrong size here seems incorrect. Even if the old data written
> through the non-O_SYNC fd wasn't written out I absolutely can't see how
> the file would have an incorrect size here. Can you please share your
> test case?

He did, way at the beginning: open a file, write 5000 bytes, close it,
open again with O_SYNC, write 300 bytes, close it, force-reboot, and
watch the file come back up with only 4096 bytes written.

I /think/ that's because generic_write_sync only flushes the range that
was dirtied by the write() call, so only the first 4k gets written back
to disk. xfs and ext4 exhibit this behavior; vfat and btrfs persist all
50000 bytes written by the script below.
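
To make the arithmetic behind that guess concrete, here is a rough sketch
of which pages a range-limited sync would cover. It assumes 4096-byte
pages and that the sync covers only the bytes the O_SYNC write touched;
the variable names are made up for illustration and are not kernel
identifiers. The sizes match the script below (50000 bytes buffered, then
300 bytes at offset 10 with O_SYNC).

```bash
#!/bin/bash
# Assumption: 4096-byte pages, and only the byte range of the O_SYNC
# write is synced. All names here are illustrative.
page=4096
file_size=50000   # bytes written through the buffered fd
off=10            # offset of the O_SYNC write
len=300           # length of the O_SYNC write

# Pages overlapping the byte range [off, off + len - 1].
first_page=$(( off / page ))
last_page=$(( (off + len - 1) / page ))

# Pages needed to hold the whole file.
total_pages=$(( (file_size + page - 1) / page ))

echo "sync range: bytes $off..$(( off + len - 1 ))"
echo "pages flushed: $first_page..$last_page"
echo "pages in file: $total_pages"
echo "bytes covered by flushed pages: $(( (last_page - first_page + 1) * page ))"
```

Under those assumptions only page 0 is in range, which would line up with
a file that comes back holding just its first 4096 bytes.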

--D

#!/bin/bash -x

# Let's see if a small O_SYNC write flushes the rest of the file?

dev="${1:-/dev/sda}"
mnt="${2:-/mnt}"
fstyp="${3:-xfs}"

# Device size in 512-byte sectors, which is the unit dm tables use.
devsz=$(blockdev --getsz "$dev")
test -z "$devsz" && exit 1

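# Clean up anything left over from a previous run (errors here are fine).
umount "$dev" "$mnt"

# Wrap the device in a dm-linear target so it can be swapped for dm-error later.
dmsetup remove crap
dmsetup create crap --table "0 $devsz linear $dev 0"
dmdev=/dev/mapper/crap
test -b "$dmdev" || exit 1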

# Start from a freshly loaded filesystem module and a fresh filesystem.
rmmod "$fstyp"

wipefs -a "$dmdev"
mkfs."$fstyp" "$dmdev"
mount "$dmdev" "$mnt"

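# Buffered write: 50000 bytes of 0x58 ('X'), no sync.
xfs_io -f -c 'pwrite -S 0x58 0 50000' "$mnt/a"
# O_SYNC write (-s): 300 bytes of 0x42 ('B') at offset 10.
xfs_io -s -c 'pwrite -S 0x42 10 300' "$mnt/a"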

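# Simulate the crash: swap in an error target without flushing, so nothing
# still sitting in the page cache can reach the disk during umount.
dmsetup suspend crap --noflush
dmsetup load crap --table "0 $devsz error"
dmsetup resume crap
dmsetup table
umount "$mnt"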

# Bring the real device back.
dmsetup suspend crap
dmsetup load crap --table "0 $devsz linear $dev 0"
dmsetup resume crap

# Remount and see what actually made it to disk.
mount "$dmdev" "$mnt"
od -tx1 -Ad -c "$mnt/a"
stat "$mnt/a"
umount "$mnt"
dmsetup remove crap