Re: [patch] FS (ext2) corruption generated by BLKFLSBUF/invalidate_buffers (2.2.x and 2.3.x)

Guest section DW (dwguest@win.tue.nl)
Fri, 26 Nov 1999 03:00:20 +0100


On Thu, Nov 25, 1999 at 07:14:05PM +0100, Andrea Arcangeli wrote:

> Jason and Samuel told me they could reliable reproduce fs corruption by
> writing to a filesystem while the ioctl(BLKFLSBUF) is running in parallel
> on the blockdevice where the fs is mounted.

Don't do that, then.

> I understood what was going on: BLKFLSBUF is just an interface for
> invalidate_buffers and invalidate_buffers() trashes away _all_ the buffers
> beloging to such a blockdevice. _Dirty_ buffers included.
>
> Actually the corruption is hided pretty well by both hdparm and the kernel
> doing a sync on the device before starting invalidate_buffers.
>
> The only two cases where we want invalidate_buffers() to flush away _also_
> dirty buffers are:
>
> o upon detected media change
> o releasing fdisk memory
>
> In all other cases where the kernel want to invalidate the buffers, it
> should first sync the device (no need to wait I/O completation, so an
> sync_buffers(dev, 0) is enough if no filesystem is mounted or
> sync_dev(dev) if the filesystem is mounted) and then calling an
> invalidate_buffers that won't trash dirty blocks. So if new dirty blocks
> are generated under invalidate_buffers (legitimate in cases like
> BLKFLSBUF), they won't be dropped.

I hope I misunderstand your "no need to wait I/O completion".
We certainly need that when BLKFLSBUF returns (i) the I/O that
was initiated before it was called has completed, and (ii) the
buffers that were involved in that I/O are now invalid.

The kernel is buggy - if block 1000 of /dev/hda is block 969
of /dev/hda1 and one writes one and reads the other, the read
may well return the contents from before the write.
So, as long as this is not fixed, applications have to resort
to sync() and BLKFLSBUF to try to avoid such problems.

A different reason for BLKFLSBUF is the desire to change media,
especially in cases where the kernel does not recognize the media
as removable.

> In general we should _never_ drop dirty buffers. If we need to drop dirty
> buffers (excluding the ramdisk case) it means we have some kind of fs
> corruption going on.
>
> This is my fix against 2.2.14pre7 (probably it won't apply to previous
> kernel for unrelated but trivially fixable rejects).

I have an aesthetic comment: instead of giving invalidate_buffers()
a second argument, and changing the source in lots of places, why
don't you introduce a second routine invalidate_all_buffers()
that is called two places or so and does the invalidate-also-dirty.
(Then, on the place where these routines are defined, you can have two
separate routines, or a common static do_invalidate() called by both,
whichever you prefer.)
This makes the patch much smaller, and the resulting kernel code better
readable - I don't like routines with flag arguments that modify
the behaviour a little, especially when the flag argument is 0 or 1
and does not have a symbolic name.

Andries

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/