Re: Reproducible OOPS with MD RAID-5 on 2.6.0-test11

From: Linus Torvalds
Date: Tue Dec 02 2003 - 13:26:36 EST




On Tue, 2 Dec 2003, Jens Axboe wrote:
>
> Smells like a bio stacking problem in raid/dm then. I'll take a quick
> look and see if anything obvious pops up, otherwise the maintainers of
> those areas should take a closer look.

There are several other problem reports which start to smell like md/raid.

> That might be a good idea, although it's not very likely to be an XFS
> problem as it happens further down the io stack. It should trigger just
> as happily on IDE or SCSI if that was the case.

There's one (by Alan Buxey) that I attributed to PREEMPT which happens on
UP with ext3 and raid1:

md: Autodetecting RAID arrays.
md: autorun ...
md: considering hdd1 ...
md: adding hdd1 ...
md: adding hda1 ...
md: created md0
md: bind<hda1>
md: bind<hdd1>
md: running: <hdd1><hda1>
raid1: raid set md0 active with 2 out of 2 mirrors
md: ... autorun DONE.

and that one shows strange memory corruption problems too.

NOTE! The fact that it only happens with PREEMPT for some people is not
necessarily a sign of preempt-only trouble: while PREEMPT-safe should really
be equivalent to SMP-safe, there are some things that are much more
likely with preemption than with normal SMP.

In particular, preempt will cause every single (final) unlock to check
whether there is something else runnable with a higher priority, so it
opens up races a lot - if you touch something just outside (in particular:
_after_) the locked region, preempt is much more likely to expose a race
whose window on SMP might be just a few instructions long.
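
To make the "touching something _after_ the locked region" case concrete,
here is a minimal single-threaded user-space sketch in C. Everything in it
is made up for illustration: toy_lock, toy_req, other_task, finish_racy and
finish_safe are hypothetical names, not kernel APIs, and this is not the
md/raid code in question. The on_unlock hook only stands in for a
higher-priority task getting to run right at the final unlock, which is an
assumption of this model rather than how the scheduler really works:

```c
#include <stddef.h>

/* Hypothetical toy lock, NOT a kernel API. */
struct toy_lock {
    int held;
    void (*on_unlock)(void *arg); /* stands in for a higher-priority task */
    void *arg;
};

/* Hypothetical request object protected by its own lock. */
struct toy_req {
    struct toy_lock lock;
    int status;
};

void toy_lock_acquire(struct toy_lock *l)
{
    l->held = 1;
}

void toy_lock_release(struct toy_lock *l)
{
    l->held = 0;
    /* Models the "is something else runnable?" check done at unlock:
     * if another task is waiting, it gets to run right here. */
    if (l->on_unlock)
        l->on_unlock(l->arg);
}

/* The "other task": it changes the request as soon as it can run. */
void other_task(void *arg)
{
    struct toy_req *req = arg;
    req->status = -1;
}

/* Racy pattern: reads req->status after leaving the locked region. */
int finish_racy(struct toy_req *req)
{
    toy_lock_acquire(&req->lock);
    req->status = 0;
    toy_lock_release(&req->lock);
    return req->status; /* touched _after_ the unlock */
}

/* Safer pattern: take a snapshot while the lock is still held. */
int finish_safe(struct toy_req *req)
{
    int status;

    toy_lock_acquire(&req->lock);
    req->status = 0;
    status = req->status;
    toy_lock_release(&req->lock);
    return status;
}
```

With the hook installed, finish_racy() hands back whatever the other task
wrote, while finish_safe() hands back the value that was set under the lock.
On real SMP the same window exists but may be only a few instructions wide.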

Linus