Re: The pread/pwrite bug

Stephen C. Tweedie (sct@redhat.com)
Wed, 2 Jun 1999 22:15:18 +0100 (BST)


Hi,

On Wed, 2 Jun 1999 08:55:58 -0400 (EDT), Alan Cox <alan@redhat.com>
said:

>> Well guys, looks like it still isn't working, The server is much slower
>> with the patch and not only throws off Typhoon, but locks it up solid to
>> require a power cycle. Any other Ideas as to where to go now?

> Where does it lock up - what does right-alt-scroll-lock show as the EIP
> if you hit it a few times when th ebox seems hung

And which kernel are you using? We have other problem reports with
2.2.9, still unresolved. If you could try the same patch against
2.2.5 or 2.2.7, that would help to narrow things down.

I have a parallel kernel build running in conjunction with the
pread/pwrite stress running on a 32MB 4-waySMP setup right now with no
apparent problems and no buffer leaks/lockups, using straight
2.2.10-pre2. [update: the runs have now completed twice without any
sign of fault.] The 2.2.9 corruption bug is definitely fixed, and
performance appears unaffected.

However, in case there is a problem with the patch, the diff below
backs out the more complex part of the fix and replaces it with a cure
just for the update_vm_cache() part of the fix. It does not cure the
bug concerning the race during copy_from_user, but it should fix the
Typhoon bug in the simplest manner possible, just by moving the
mark_buffer_uptodate() to before the update_vm_cache() in
ext2_file_write. I won't be able to test anything for the next week
or so as I'll be away for Usenix, so feel free to try both variants of
the fix. I'll be online to respond to any experiences you have,
though.

--Stephen

----------------------------------------------------------------
--- fs/ext2/file.c.~1~ Wed Jun 2 18:49:50 1999
+++ fs/ext2/file.c Wed Jun 2 18:50:52 1999
@@ -162,7 +162,7 @@
struct buffer_head * bh, *bufferlist[NBUF];
struct super_block * sb;
int err;
- int i,buffercount,write_error, new_buffer;
+ int i,buffercount,write_error;

/* POSIX: mtime/ctime may not change for 0 count */
if (!count)
@@ -247,53 +247,24 @@
}
if (c > count)
c = count;
-
- /* Tricky: what happens if we are writing the complete
- * contents of a block which is not currently
- * initialised? We have to obey the same
- * synchronisation rules as the IO code, to prevent some
- * other process from stomping on the buffer contents by
- * refreshing them from disk while we are setting up the
- * buffer. The copy_from_user() can page fault, after
- * all. We also have to throw away partially successful
- * copy_from_users to such buffers, since we can't trust
- * the rest of the buffer_head in that case. --sct */
-
- new_buffer = (!buffer_uptodate(bh) && !buffer_locked(bh) &&
- c == sb->s_blocksize);
-
- if (new_buffer) {
- set_bit(BH_Lock, &bh->b_state);
- c -= copy_from_user (bh->b_data + offset, buf, c);
- if (c != sb->s_blocksize) {
- c = 0;
- unlock_buffer(bh);
- brelse(bh);
+ if (c != sb->s_blocksize && !buffer_uptodate(bh)) {
+ ll_rw_block (READ, 1, &bh);
+ wait_on_buffer (bh);
+ if (!buffer_uptodate(bh)) {
+ brelse (bh);
if (!written)
- written = -EFAULT;
+ written = -EIO;
break;
}
- mark_buffer_uptodate(bh, 1);
- unlock_buffer(bh);
- } else {
- if (!buffer_uptodate(bh)) {
- ll_rw_block (READ, 1, &bh);
- wait_on_buffer (bh);
- if (!buffer_uptodate(bh)) {
- brelse (bh);
- if (!written)
- written = -EIO;
- break;
- }
- }
- c -= copy_from_user (bh->b_data + offset, buf, c);
}
+ c -= copy_from_user (bh->b_data + offset, buf, c);
if (!c) {
brelse(bh);
if (!written)
written = -EFAULT;
break;
}
+ mark_buffer_uptodate(bh, 1);
mark_buffer_dirty(bh, 0);
update_vm_cache(inode, pos, bh->b_data + offset, c);
pos += c;

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/