Re: Software raid0 will crash the file-system, when each disk is 5TB

From: Bill Davidsen
Date: Tue May 22 2007 - 17:30:21 EST


Jeff Zheng wrote:
Fix confirmed, filled the whole 11T hard disk, without crashing.
I presume this would go into 2.6.22

Since it results in a full loss of data, I would hope it goes into 2.6.21.x -stable.

Thanks again.

Jeff

-----Original Message-----
From: linux-raid-owner@xxxxxxxxxxxxxxx [mailto:linux-raid-owner@xxxxxxxxxxxxxxx] On Behalf Of Jeff Zheng
Sent: Thursday, 17 May 2007 5:39 p.m.
To: Neil Brown; david@xxxxxxx; Michal Piotrowski; Ingo Molnar; linux-raid@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linux-fsdevel@xxxxxxxxxxxxxxx
Subject: RE: Software raid0 will crash the file-system, when each disk is 5TB


Yeah, seems you've locked it down, :D. I've written 600GB of data now, and everything is still fine.
Will let it run overnight, and fill the whole 11T. I'll post the result tomorrow.

Thanks a lot though.

Jeff

-----Original Message-----
From: Neil Brown [mailto:neilb@xxxxxxx]
Sent: Thursday, 17 May 2007 5:31 p.m.
To: david@xxxxxxx; Jeff Zheng; Michal Piotrowski; Ingo Molnar; linux-raid@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linux-fsdevel@xxxxxxxxxxxxxxx
Subject: RE: Software raid0 will crash the file-system, when each disk is 5TB

On Thursday May 17, neilb@xxxxxxx wrote:
Uhm, I just noticed something.
'chunk' is unsigned long, and when it gets shifted up, we might lose
bits. That could still happen with the 4*2.75T arrangement, but is
much more likely in the 2*5.5T arrangement.
Actually, it cannot be a problem with the 4*2.75T arrangement.
chunk << chunksize_bits

will not exceed the size of the underlying device *in*kilobytes*.
In that case that is 0xAE9EC800 which will fit in a 32bit long.
We don't double it to make sectors until after we add
zone->dev_offset, which is "sector_t" and so 64bit arithmetic is used.
So I'm quite certain this bug will cause exactly the problems experienced!!
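
The truncation described above can be reproduced in user space. The following is only a rough sketch, not the kernel code: it assumes a 32-bit 'unsigned long' (modelled here as uint32_t) and a 64-bit sector_t (modelled as uint64_t), and the names ulong32_t, sector64_t, rsect_before, rsect_after and overflow_demo are made up for illustration. The 64KiB chunk size (chunksize_bits = 6) is likewise an assumed value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types on a 32-bit machine. */
typedef uint32_t ulong32_t;   /* models a 32-bit 'unsigned long' */
typedef uint64_t sector64_t;  /* models a 64-bit sector_t        */

/* Pre-patch shape: the shift is done in 32 bits, so any bits above
 * bit 31 are lost before dev_offset is added in 64-bit arithmetic. */
static sector64_t rsect_before(ulong32_t chunk, unsigned chunksize_bits,
                               sector64_t dev_offset, unsigned sect_in_chunk)
{
    ulong32_t kb = (ulong32_t)(chunk << chunksize_bits);
    return (((sector64_t)kb + dev_offset) << 1) + sect_in_chunk;
}

/* Post-patch shape: widen chunk first, then shift in 64 bits. */
static sector64_t rsect_after(ulong32_t chunk, unsigned chunksize_bits,
                              sector64_t dev_offset, unsigned sect_in_chunk)
{
    return ((((sector64_t)chunk << chunksize_bits) + dev_offset) << 1)
           + sect_in_chunk;
}

/* Assumed values: 64KiB chunks (chunksize_bits = 6), dev_offset = 0. */
static void overflow_demo(void)
{
    /* A chunk 5TiB into a device: 0x05000000 << 6 = 0x140000000 KB,
     * which needs 33 bits and wraps to 0x40000000 in 32 bits. */
    assert(rsect_before(0x05000000u, 6, 0, 0) == 0x80000000ULL);
    assert(rsect_after(0x05000000u, 6, 0, 0) == 0x280000000ULL);

    /* The 2.75T case: 0x02BA7B20 << 6 = 0xAE9EC800 still fits in
     * 32 bits, so both forms agree. */
    assert(rsect_before(0x02BA7B20u, 6, 0, 0) ==
           rsect_after(0x02BA7B20u, 6, 0, 0));
}
```

With these assumed numbers the unpatched form maps a request 5TiB into the device onto sector 0x80000000, i.e. 1TiB in, which would be consistent with data being silently written over earlier parts of the device.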

Jeff, can you try this patch?
Don't bother about the other tests I mentioned, just try this one.
Thanks.

NeilBrown

Signed-off-by: Neil Brown <neilb@xxxxxxx>

### Diffstat output
./drivers/md/raid0.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff .prev/drivers/md/raid0.c ./drivers/md/raid0.c
--- .prev/drivers/md/raid0.c 2007-05-17 10:33:30.000000000 +1000
+++ ./drivers/md/raid0.c 2007-05-17 15:02:15.000000000 +1000
@@ -475,7 +475,7 @@ static int raid0_make_request (request_q
x = block >> chunksize_bits;
tmp_dev = zone->dev[sector_div(x, zone->nb_dev)];
}
- rsect = (((chunk << chunksize_bits) + zone->dev_offset)<<1)
+ rsect = ((((sector_t)chunk << chunksize_bits) + zone->dev_offset)<<1)
  + sect_in_chunk;
bio->bi_bdev = tmp_dev->bdev;
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html




--
Bill Davidsen <davidsen@xxxxxxx>
"We have more to fear from the bungling of the incompetent than from
the machinations of the wicked." - from Slashdot