hi!
it looks like raid5 is buggy. as 2nd device is failed, I got oops.
it's BUG at raid5.c:212. this is because sh->written[N] isn't NULL.
raid5d thread try to handle this stripe, but following condition
skip handling:
/* might be able to return some write requests if the parity block
* is safe, or on a failed drive
*/
bh = sh->bh_cache[sh->pd_idx];
if ( written &&
( (conf->disks[sh->pd_idx].operational && !buffer_locked(bh) && buffer_uptodate(bh))
|| (failed == 1 && failed_num == sh->pd_idx))
) {
I suggest to fail requests which can't be returned as successful.
with best regards
diff -puNr linux/drivers/md/raid5.c edited/drivers/md/raid5.c
--- linux/drivers/md/raid5.c Wed Jan 15 20:18:37 2003
+++ edited/drivers/md/raid5.c Tue Apr 15 12:59:24 2003
@@ -938,7 +938,19 @@ static void handle_stripe(struct stripe_
}
}
}
-
+
+ /* if already written requests can't be returned as successful fail them */
+ if (failed > 1 && written) {
+ for (i=disks; i--; ) {
+ if (sh->bh_written[i]) written--;
+ while ((bh = sh->bh_written[i])) {
+ sh->bh_written[i] = bh->b_reqnext;
+ bh->b_reqnext = return_fail;
+ return_fail = bh;
+ }
+ }
+ }
+
/* Now we might consider reading some blocks, either to check/generate
* parity, or to satisfy requests
*/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Tue Apr 15 2003 - 22:00:35 EST