Re: Reproducable OOPS with MD RAID-5 on 2.6.0-test11

From: Simon Kirby
Date: Wed Dec 03 2003 - 20:14:20 EST


On Tue, Dec 02, 2003 at 10:23:17AM -0800, Linus Torvalds wrote:

> On Tue, 2 Dec 2003, Jens Axboe wrote:
> >
> > Smells like a bio stacking problem in raid/dm then. I'll take a quick
> > look and see if anything obvious pops up, otherwise the maintainers of
> > those areas should take a closer look.
>
> There are several other problem reports which start to smell like md/raid.

Btw, I had trouble creating a linear array (probably not very common)
with size > 2 TB. I expected it to just complain, but it ended up
resulting in hard lockups. With a slight size change (removed one
drive), it ended up printing a double fault Oops, and all sorts of neat
stuff. I suspect the hashing was overflowing and writing bits in
unexpected places.

In any event, this patch against 2.6.0-test11 compiles without warnings,
boots, and (bonus) actually works:

--- drivers/md/linear.c.orig 2003-10-29 08:16:35.000000000 -0800
+++ drivers/md/linear.c 2003-12-03 16:19:59.000000000 -0800
@@ -214,10 +214,11 @@
char b[BDEVNAME_SIZE];

printk("linear_make_request: Block %llu out of bounds on "
- "dev %s size %ld offset %ld\n",
+ "dev %s size %lld offset %lld\n",
(unsigned long long)block,
bdevname(tmp_dev->rdev->bdev, b),
- tmp_dev->size, tmp_dev->offset);
+ (unsigned long long)tmp_dev->size,
+ (unsigned long long)tmp_dev->offset);
bio_io_error(bio, bio->bi_size);
return 0;
}
--- include/linux/raid/linear.h.orig 2003-11-26 12:45:44.000000000 -0800
+++ include/linux/raid/linear.h 2003-12-03 16:18:00.000000000 -0800
@@ -5,8 +5,8 @@

struct dev_info {
mdk_rdev_t *rdev;
- unsigned long size;
- unsigned long offset;
+ sector_t size;
+ sector_t offset;
};

typedef struct dev_info dev_info_t;

Simon-
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/