Re: trouble with mounting a 1.5 TB raid6 volume in 2.6.19-rc5-mm1

From: Neil Brown
Date: Mon Nov 13 2006 - 18:58:34 EST


On Monday November 13, thunder7@xxxxxxxxx wrote:
> From: Neil Brown <neilb@xxxxxxx>
> Date: Mon, Nov 13, 2006 at 08:43:54PM +1100
> > On Monday November 13, neilb@xxxxxxx wrote:
> > >
> > > Can you try reverting this patch (patch -p1 -R) ?
> > >
> >
> > Yes, I'm confident that reverting that patch will fix your problem.
> > I actually found a number of errors while tracking this down :-(
>
> Oops. Do you have some kind of test-suite to run on md-devices before
> releasing a patch? I didn't think my hardware was that exotic.

Yes, I have a test suite (part of the mdadm package - anyone can try
running it - "make everything ; sh test").
The trouble is that I don't run it as often as I should. I plan to
fix that, but it really needs to run automatically to be reliable - I
can't be trusted to run it when required. :-(
I've taken steps to setting something up to that it does run
automagically every night against various kernels, but there is still
a way to go...

It did catch this raid6 problem when I did run it, but there were a
few other bugs in there that the test suite would not have found
(e.g. one was a memory leak and checking generically for memory leaks
is not trivial).

>
> I had a scary moment just now, since 2.6.19-rc5-mm1 without your patch
> wanted to fsck the volume, and started repairing errors which didn't
> exist on disk. Luckily, it stopped before inode 64, and I was able to
> fsck the volume on an earlier kernel without apparent damage.
>
> Still, a test suite seems like a good idea. Also, getting Andrew to
> either add this as a hot-fix or release -mm2 would be a good idea,
> IMVHO.

I'll suggest something when I submit the patches which fix all the
bugs I found (test-suite still running just at the moment).

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/