Re: Reproducable OOPS with MD RAID-5 on 2.6.0-test11

From: Nathan Scott
Date: Tue Dec 02 2003 - 05:14:09 EST


On Tue, Dec 02, 2003 at 09:27:13AM +0100, Jens Axboe wrote:
> On Mon, Dec 01 2003, Kevin P. Fleming wrote:
> >
> > without device-mapper in place, though, and I could not reproduce the
> > problem! I copied > 500MB of stuff to the XFS filesystem created using
> > the entire /dev/md/0 device without a single unusual message. I then
> > unmounted the filesystem and used pvcreate/vgcreate/lvcreate to make a
> > 3G volume on the array, made an XFS filesystem on it, mounted it, and
> > tried copying data over. The oops message came back.
>
> Smells like a bio stacking problem in raid/dm then. I'll take a quick
> look and see if anything obvious pops up, otherwise the maintainers of
> those areas should take a closer look.

One thing that might be of interest - XFS does tend to pass
variable size requests down to the block layer, and this has
tripped up md and other drivers in 2.4 in the distant past.

Log IO is typically 512 byte aligned (as opposed to block or
page size aligned), as are IOs into several of XFS' metadata
structures.

> > I'm copying this message to linux-lvm; the original oops message is
> > repeated below for the benefit of those list readers. I've got one more
> > round of testing to do (after the array resyncs itself), which is to try
> > a filesystem other than XFS.
>
> That might be a good idea, although it's not very likely to be an XFS
> problem as it happens further down the io stack. It should trigger just
> as happily on IDE or SCSI if that was the case.

I would tend to agree (but will happily fix things if proven
wrong ;) - I took a brief look through dm & md this afternoon
but nothing obvious jumped out at me. I'm not particularly
familiar with that code though.

cheers.

--
Nathan
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/