[patch] fix for some ext2 race in 2.2.13

Andrea Arcangeli (andrea@suse.de)
Sun, 7 Nov 1999 17:39:30 +0100 (CET)


I spotted and fixed some subtle ext2 race in 2.2.13.

o in inode.c and balloc.c we are zeroing the block and then we are
marking the buffer uptodate and dirty. But while we write to the
buffer, the buffer can be locked and under read-IO from the
block_dev.c layer. So the underlying read may invalidate our zeros
with random on-disk data (random as the buffer was not allocated
previously). This is only a data corruption trouble and it has to
deal with truncates beyond the end of the file.

o In fsync we are not waiting I/O completation before returning
from the syscall. So we could return even if the blockdevice layer
has not yet passed the I/O request to the device driver (so the
data could never hit the disk unless somebody else
will generate I/O).

o while doing the second pass (the wait pass) of fsync we can't
assume that there's been an I/O error if the buffer is requested
and not updodate. The buffer may be under I/O and not uptodate
becuase somebody started a read-IO on the buffer from block_dev.c.
This may lead to wrong -EIO errors from fsync (I also written a
testcase to reproduce this and I effectively got -EIO from fsync
and my patch fixed it).

Fix against 2.2.13:

diff -urN 2.2.13/fs/ext2/balloc.c ext2/fs/ext2/balloc.c
--- 2.2.13/fs/ext2/balloc.c Sun Oct 31 23:31:31 1999
+++ ext2/fs/ext2/balloc.c Sat Nov 6 02:32:48 1999
@@ -603,6 +603,8 @@
unlock_super (sb);
return 0;
}
+ if (!buffer_uptodate(bh))
+ wait_on_buffer(bh);
memset(bh->b_data, 0, sb->s_blocksize);
mark_buffer_uptodate(bh, 1);
mark_buffer_dirty(bh, 1);
diff -urN 2.2.13/fs/ext2/fsync.c ext2/fs/ext2/fsync.c
--- 2.2.13/fs/ext2/fsync.c Sun Oct 31 23:31:31 1999
+++ ext2/fs/ext2/fsync.c Sat Nov 6 04:25:40 1999
@@ -45,10 +45,21 @@
if (!bh)
return 0;
if (wait && buffer_req(bh) && !buffer_uptodate(bh)) {
- brelse (bh);
- return -1;
+ /* There can be a parallell read(2) that started read-I/O
+ on the buffer so we can't assume that there's been
+ an I/O error without first waiting I/O completation. */
+ wait_on_buffer(bh);
+ if (!buffer_uptodate(bh))
+ {
+ brelse (bh);
+ return -1;
+ }
}
if (wait || !buffer_uptodate(bh) || !buffer_dirty(bh)) {
+ if (wait)
+ /* when we return from fsync all the blocks
+ must be _just_ stored on disk */
+ wait_on_buffer(bh);
brelse (bh);
return 0;
}
@@ -68,10 +79,21 @@
if (!bh)
return 0;
if (wait && buffer_req(bh) && !buffer_uptodate(bh)) {
- brelse (bh);
- return -1;
+ /* There can be a parallell read(2) that started read-I/O
+ on the buffer so we can't assume that there's been
+ an I/O error without first waiting I/O completation. */
+ wait_on_buffer(bh);
+ if (!buffer_uptodate(bh))
+ {
+ brelse (bh);
+ return -1;
+ }
}
if (wait || !buffer_uptodate(bh) || !buffer_dirty(bh)) {
+ if (wait)
+ /* when we return from fsync all the blocks
+ must be _just_ stored on disk */
+ wait_on_buffer(bh);
brelse (bh);
return 0;
}
diff -urN 2.2.13/fs/ext2/inode.c ext2/fs/ext2/inode.c
--- 2.2.13/fs/ext2/inode.c Sun Oct 31 23:31:31 1999
+++ ext2/fs/ext2/inode.c Sat Nov 6 21:48:10 1999
@@ -121,6 +121,8 @@
"cannot get block %lu", result);
return 0;
}
+ if (!buffer_uptodate(bh))
+ wait_on_buffer(bh);
memset(bh->b_data, 0, inode->i_sb->s_blocksize);
mark_buffer_uptodate(bh, 1);
mark_buffer_dirty(bh, 1);

Andrea

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/