Re: [PATCH] fat: fix corruption in fat_alloc_new_dir()
From: Jan Stancek
Date: Tue Sep 10 2019 - 12:27:13 EST
----- Original Message -----
> Jan Stancek <jstancek@xxxxxxxxxx> writes:
> >> Using the device while mounting same device doesn't work reliably like
> >> this race. (getblk() is intentionally used to get the buffer to write
> >> new data.)
> > Are you saying this is expected even if 'usage' is just read?
> Yes, assuming exclusive access.
It seems we have been lucky so far to hit this only with FAT.
I also tried a couple of variations of the reproducer:
- Disabling udevd and running just "blkid --probe" in parallel
also reproduced it.
- Disabling udevd and running read() on the first 1024 sectors in parallel
also reproduced it.
- aio_read() submitted prior to mount could reproduce it,
as long as the fd was held open.
- I couldn't reproduce it with fadvise/madvise WILLNEED submitted prior to mount.
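
For illustration, the read() variation above could look roughly like the
sketch below. This is not the actual reproducer: the helper name
read_first_sectors() and the 512-byte sector size are assumptions made for
the example. It opens the device without O_EXCL and reads sequentially from
the start, which is the kind of plain read that can race with mount.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption for this sketch: 512-byte logical sectors. */
#define SECTOR_SIZE 512

/*
 * Hypothetical helper: read up to nsectors sectors from the start of
 * path. Returns the number of bytes read, or -1 on error.
 */
long long read_first_sectors(const char *path, unsigned int nsectors)
{
	char buf[SECTOR_SIZE];
	long long total = 0;
	unsigned int i;
	int fd;

	/* Plain read-only open, no O_EXCL, like a probing tool would do. */
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	for (i = 0; i < nsectors; i++) {
		ssize_t n = read(fd, buf, sizeof(buf));

		if (n < 0) {
			close(fd);
			return -1;
		}
		if (n == 0)	/* end of device or file */
			break;
		total += n;
	}

	close(fd);
	return total;
}
```

Running read_first_sectors(dev, 1024) in a loop while another process mounts
the same device would approximate that variation.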
> >> mount(2) internally opens the device by EXCL mode, so I guess udev opens
> >> without EXCL (I dont know if it is intent or not).
> > I gave this a try and added O_EXCL to udev-builtin-blkid.c. My system had
> > trouble booting, it was getting stuck on mounting LVM volumes.
> > So, I'm not sure how to move forward here.
> OK. I'm still think the userspace should avoid to use blockdev while
> mounting though, this patch will workaround this race with small race.
https://systemd.io/BLOCK_DEVICE_LOCKING.html mentions flock(LOCK_EX) as a way
to avoid probing while "another program concurrently modifies a superblock or
partition table". Adding flock(LOCK_EX) works around the problem too, but that
would address the problem only for LTP (and for tools/scripts that take the
same approach).
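
A minimal sketch of that flock(LOCK_EX) workaround, as a test could apply it
around mount(2), is below. The helper names lock_blockdev_excl() and
unlock_blockdev() are made up for this example. The lock is advisory, so it
only helps against probers that take a BSD lock themselves before reading
the device, as the page above describes.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper: open path and take an exclusive BSD lock on it.
 * Returns the locked fd, or -1 on error. The lock is held until the fd
 * is unlocked or closed.
 */
int lock_blockdev_excl(const char *path)
{
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;

	/* Blocks until no other open file description holds a lock. */
	while (flock(fd, LOCK_EX) < 0) {
		if (errno == EINTR)
			continue;
		close(fd);
		return -1;
	}
	return fd;
}

/* Hypothetical helper: drop the lock and close the fd. */
void unlock_blockdev(int fd)
{
	flock(fd, LOCK_UN);
	close(fd);
}
```

The caller would take the lock, call mount(), and release the lock only
after mount() returns, so that a cooperating prober stays off the device in
between.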
> Can you test this?
> OGAWA Hirofumi <hirofumi@xxxxxxxxxxxxxxxxxx>
> [PATCH] fat: Workaround the race with userspace's read via blockdev while
I ran the reproducer on the patched kernel for 5 hours; it made over 25000
iterations with no corruption. Thank you for looking at this.