Re: Your rename change and loopback

Alexander Viro (viro@math.psu.edu)
Sat, 22 May 1999 01:10:37 -0400 (EDT)


On Fri, 21 May 1999, H.J. Lu wrote:

> Hi,
>
> I found out that your rename change in kernels 2.2.6-2.2.9 broke the
> loopback device. The problem is random. I have to run a script, which
> uses the loopback device, about 80 times to reproduce the bug. At 10
> minutes each, it takes about 13 hours to reproduce it. When the bug shows
> up, I get the kernel messages like:
>
> EXT2-fs error (device loop(7,0)): ext2_add_entry: bad entry in directory #12: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
> EXT2-fs error (device loop(7,0)): ext2_add_entry: bad entry in directory #12: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
> EXT2-fs error (device loop(7,0)): ext2_add_entry: bad entry in directory #12: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
>
> or
>
> file_cluster badly computed!!! 2989 <> 1949
> file_cluster badly computed!!! 2990 <> 1950
> file_cluster badly computed!!! 2991 <> 1951
> file_cluster badly computed!!! 2992 <> 1952
> file_cluster badly computed!!! 2993 <> 1953
>
> Do you have any ideas? Does your patch take loopback into account?

<sound of dropping jaw>
It lives several layers above. Are you sure that it's rename()?
If anything, buffer.c changes in the same releases seem to be more likely
candidates... Could you give details on the script? (preferably just email
it)... Especially since you are having problems with ext2 - changes there
are minimal. If anything, they might expose dcache corruption from other
sources, but that would hardly manifest itself in that way.

There is one case where I'm *sure* that buffer.c contains a bug able to
cause fs corruption. It can be triggered by consecutive set_blocksize()
calls racing with bdflush. It is fixed in 2.3.2 and the patch is fairly
trivial. It *might* give the effects you are describing (especially with
fast dirtify/umount/mount/dirtify sequences), but to say whether that's
the case here I'll need to look at your script.

Which versions did you test, BTW? That might help to localize the
problem. Please, give some details...
Cheers,
Al

PS: fix of buffer.c bug in question being:
diff -urN linux-2.2.6/fs/buffer.c linux-bird.FAT/fs/buffer.c
--- linux-2.2.6/fs/buffer.c Sun Mar 28 14:54:47 1999
+++ linux-bird.misc/fs/buffer.c Sat Apr 10 00:30:01 1999
@@ -672,7 +672,9 @@
 					clear_bit(BH_Req, &bh->b_state);
 					bh->b_flushtime = 0;
 				}
-				remove_from_hash_queue(bh);
+				remove_from_queues(bh);
+				bh->b_dev=B_FREE;
+				insert_into_queues(bh);
 			}
 		}
 	}

It has nothing to do with the rename patch. It did make its way into
2.3.2, but not into 2.2.x (yet). See if it helps...
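
For readers who want to see what the three added lines change, here is a
rough user-space sketch. It is a toy model, not kernel code: the toy_*
names, the struct and the boolean flags are all made up for illustration,
and it assumes (from my reading of the 2.2-era helpers, so treat it as an
assumption) that remove_from_hash_queue() only unhashes a buffer, that
remove_from_queues() also takes it off its LRU list, and that
insert_into_queues() puts a buffer marked B_FREE on the free list. It does
not try to model the race with bdflush itself.

```c
#include <stdbool.h>

#define TOY_B_FREE (-1)	/* hypothetical stand-in for the kernel's B_FREE */

/* Toy buffer: flags record which lists the buffer is still reachable from. */
struct toy_buf {
	int dev;	/* device the buffer claims to belong to */
	bool hashed;	/* reachable through the (dev, block) hash */
	bool on_lru;	/* reachable by anything walking the LRU list */
	bool on_free;	/* sitting on the free list */
};

/* Pre-patch behaviour as modelled here: only the hash link is dropped. */
void toy_invalidate_old(struct toy_buf *b)
{
	b->hashed = false;
}

/* Post-patch behaviour, mirroring the three added lines of the diff. */
void toy_invalidate_new(struct toy_buf *b)
{
	/* remove_from_queues(bh): leave both the hash and the LRU list */
	b->hashed = false;
	b->on_lru = false;
	/* bh->b_dev = B_FREE */
	b->dev = TOY_B_FREE;
	/* insert_into_queues(bh): a free buffer ends up on the free list */
	b->on_free = true;
}

/* Would a flusher-like walker of the LRU list still see this buffer
 * as belonging to device dev? */
bool toy_visible_to_flusher(const struct toy_buf *b, int dev)
{
	return b->on_lru && b->dev == dev;
}
```

In this model, a buffer for device 7 that goes through toy_invalidate_old()
is no longer hashed but is still visible to the LRU walker, while one that
goes through toy_invalidate_new() is not visible and ends up marked free.
Whether that leftover LRU entry is what actually bites in the loopback case
is exactly the open question above.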
