Re: Linux 2.4.30-rc3 md/ext3 problems

From: Neil Brown
Date: Mon Mar 28 2005 - 19:12:13 EST


On Monday March 28, vherva@xxxxxxxxxx wrote:
> On Mon, Mar 28, 2005 at 10:34:05AM +0300, [Ville Herva] wrote:
> >
> > I just upgraded from linux-2.4.21 + vserver 0.17 to 2.4.30rc3 + vserver
> > 1.2.10. The box has been running stable with 2.4.21 + vserver 0.17/0.16 for
> > a few years (uptime before reboot was nearly 400 days.)
> >
> > The boot went fine, but after a few hours I got
> > Message from syslogd@box at Sun Mar 27 22:07:00 2005 ...
> > kernel: journal commit I/O error

I got that error on 2.4.30-rc1 a couple of times, and now cannot
reproduce it :-(
But if you got it too, then it wasn't just bad luck.

The ext3 code in 2.4.30-rc does have a few more checks for IO errors
which will cause the journal to be aborted and produce this error, so
I suspect that the change which caused the problem is a change in ext3.
However, that doesn't mean the bug is there.

The extra code in ext3 seems to just check if buffer_uptodate is false
after it has waited on a locked buffer, and triggers a journal abort
if it isn't. This should be perfectly safe, and I cannot find any
logic error nearby. But nor can I find any errors that would cause a
buffer returned from raid1 to not be uptodate (unless there really was
an IO error).
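
The check described above can be sketched roughly as follows. This is a
userspace mock, not the actual ext3/jbd code: the struct and function
names (mock_buffer, mock_journal, mock_end_io, mock_commit_check) and
the MOCK_EIO value are invented for illustration, and the real kernel
would sleep in the wait step rather than assert. It only shows the
shape of the logic: wait for the buffer to be unlocked, then abort the
journal if the buffer is not uptodate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the kernel's struct buffer_head or journal_t. */
struct mock_buffer {
    bool locked;    /* I/O still in flight */
    bool uptodate;  /* set only if the I/O completed successfully */
};

struct mock_journal {
    bool aborted;   /* once set, no further commits are allowed */
    int  err;       /* error recorded at abort time */
};

#define MOCK_EIO 5  /* assumed error code for this sketch */

/* Simulates I/O completion by the lower layer (e.g. raid1):
 * record success or failure, then unlock the buffer. */
static void mock_end_io(struct mock_buffer *bh, bool success)
{
    bh->uptodate = success;
    bh->locked = false;
}

/* The kernel would sleep here until the buffer is unlocked.
 * In this mock the I/O must already have completed. */
static void mock_wait_on_buffer(const struct mock_buffer *bh)
{
    assert(!bh->locked);
}

/* The extra check: after waiting, a buffer that is not uptodate
 * is treated as an I/O error and the journal is aborted. */
static int mock_commit_check(struct mock_journal *j,
                             const struct mock_buffer *bh)
{
    assert(j != NULL && bh != NULL);

    mock_wait_on_buffer(bh);
    if (!bh->uptodate) {
        j->aborted = true;
        j->err = -MOCK_EIO;
        return -MOCK_EIO;
    }
    return 0;
}
```

In this model the only way to reach the abort path is for the completion
handler to leave the buffer not uptodate, which is why the question
becomes whether the layer below ext3 can do that without a real IO error.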


NeilBrown