Re: filesystem bug?

From: Stephen C. Tweedie
Date: Thu Dec 18 2003 - 16:31:36 EST


Hi,

On Wed, 2003-12-17 at 23:24, Tsuchiya Yoshihiro wrote:
> Tsuchiya Yoshihiro wrote:
>
> > Especially with Ext2 reproducing is easy, it happens in a few hours
> > with my script.
> > With Ext3, in a day if you are lucky.

I've seen plenty of problems which seem more easily reproduced with one
fs over another but which turned out to be due to either bad hardware or
a kernel bug somewhere completely different in the system. But the
basic knowledge that it happens on multiple filesystems is really
helpful to eliminate possibilities.

> > Following is the failed combination:
> > Redhat9 with 2.4.20-8 ext2 and ext3
> > Redhat9 with 2.4.20-19.9 ext2 and ext3
> > Redhat9 with 2.4.20-24.9 ext2
>
> I forgot to mention that I had been testing 2.4.20 from kernel.org
> also.... And it failed now!

This looks more and more like either bad hardware, or a specific device
driver problem. What storage is being used here?

It could possibly be a core VFS bug, but the VFS is in general pretty
reliable under load. We've had problems under specific edge conditions
such as races between sync and unmount, but the basic VFS behaviour
under load generally gets _lots_ of testing, so I'd definitely start by
looking elsewhere.

I'd also like to see how your 2.4.23 and 2.6.0-test11 testing is going.
That might give some clues, too. There's a race between clear_inode()
and read_inode() fixed in those kernels, but that doesn't look relevant
here; there may be something else changed that's significant, though.

Cheers,
Stephen
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/