Re: Your rename change and loopback

H.J. Lu (hjl@lucon.org)
Sat, 22 May 1999 08:35:07 -0700 (PDT)


>
>
>
> On Fri, 21 May 1999, H.J. Lu wrote:
>
> > Hi,
> >
> > I found that your rename change in kernels 2.2.6-2.2.9 broke the
> > loopback device. The problem is random. I have to run a script, which
> > uses the loopback device, 80 times to reproduce the bug. At 10 minutes
> > each, it takes almost 13 hours to reproduce it. When the bug shows
> > up, I get kernel messages like:
> >
> > EXT2-fs error (device loop(7,0)): ext2_add_entry: bad entry in directory #12: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
> > EXT2-fs error (device loop(7,0)): ext2_add_entry: bad entry in directory #12: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
> > EXT2-fs error (device loop(7,0)): ext2_add_entry: bad entry in directory #12: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
> >
> > or
> >
> > file_cluster badly computed!!! 2989 <> 1949
> > file_cluster badly computed!!! 2990 <> 1950
> > file_cluster badly computed!!! 2991 <> 1951
> > file_cluster badly computed!!! 2992 <> 1952
> > file_cluster badly computed!!! 2993 <> 1953
> >
> > Do you have any ideas? Does your patch take loopback into account?
>
> <sound of dropping jaw>
> It lives several layers above. Are you sure that it's rename()?
> If anything, buffer.c changes in the same releases seem to be more likely
> candidates... Could you give details on the script? (preferably just email
> it)... Especially since you are having problems with ext2 - changes there

It is a very complicated process. The script plus the files it uses
take about 700MB.

> are minimal. If anything, they might expose dcache corruption from other
> sources, but that would hardly manifest itself in that way.
>

Very possible.

> There is one case when I'm *sure* that buffer.c contains a bug able to
> cause fs corruption. It can be triggered by consecutive set_blocksize()
> calls racing with bdflush. It is fixed in 2.3.2 and the patch is fairly
> trivial. It *might* give the effects you are describing (especially with
> fast dirtify/umount/mount/dirtify sequences), but to say whether it's the

My script does have VERY FAST dirtify/umount/mount/dirtify sequences.
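
To illustrate what such a sequence looks like, here is a minimal sketch. It is not the actual script (which is far larger); the image path, mount point, file names, and the helper names `cycle_commands` and `run` are all hypothetical. It only builds the command list and prints it unless told otherwise, since really running it needs root and a prepared filesystem image.

```python
import subprocess


def cycle_commands(image, mountpoint, iterations):
    """Build the commands for repeated mount/dirtify/umount cycles.

    Hypothetical illustration only: each cycle mounts the image through
    the loopback device, dirties the filesystem with a create plus a
    rename, and unmounts again right away.
    """
    cmds = []
    for i in range(iterations):
        cmds.append(["mount", "-o", "loop", image, mountpoint])
        # Dirtify: create a file, then rename it inside the mount.
        cmds.append(["touch", f"{mountpoint}/f{i}"])
        cmds.append(["mv", f"{mountpoint}/f{i}", f"{mountpoint}/g{i}"])
        cmds.append(["umount", mountpoint])
    return cmds


def run(cmds, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Assumed paths; a dry run just prints what would be executed.
    run(cycle_commands("fs.img", "/mnt/test", 3))
```

Each pass leaves dirty buffers behind just before the umount, which is the pattern described above.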

> case here I'll need to look at your script.
>
> Which versions did you test, BTW? That might help to localize the
> problem. Please, give some details...
> Cheers,

I have run many tests. The results are:

1. 2.2.5 is ok.
2. 2.2.6 has the bug. buffer.c was not changed between 2.2.5 and 2.2.6.
3. 2.2.5 + some rename patch from 2.2.6 has the bug.
4. 2.2.5 + your rename-patch-11, which I believe is similar to the
rename patch from 2.2.6, has the bug.

Everything points to the rename change in 2.2.6. It may not be
the real cause of the bug. But at least it causes the bug to show up.

Thanks.

H.J.
