Re: RAID5 sector exists (2.1.10x)

Gadi Oxman (gadio@netvision.net.il)
Tue, 21 Jul 1998 21:19:27 +0400 (IDT)


On Tue, 21 Jul 1998, Bill Hawes wrote:

> Mike Black wrote:
> >
> > This is REAL consistent -- every one or two days now I'm getting the EXACT
> > same set of error messages on my RAID5 setup. The sector numbers match (the
> > bh and bh_new are different each time). The most recent is on 2.1.109 (has
> > happened on every kernel for a while). This doesn't appear to cause any
> > problems...chkraid and e2fsck always check out OK. Any ideas?
> >
> > >Jul 20 11:54:00 medusa kernel: raid5: bug: stripe->bh_new[1], sector 2400748 exists
> > >Jul 20 11:54:00 medusa kernel: raid5: bh c05d8c20, bh_new c05d88c0
> > >Jul 20 11:54:00 medusa kernel: raid5: bug: stripe->bh_new[2], sector 2400750 exists
>
> Any chance you could work back from the sector and find what files are
> involved? It might help to know what fs operations are being done at the
> time.
>
> Regards,
> Bill
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.rutgers.edu
> Please read the FAQ at http://www.altern.org/andrebalsa/doc/lkml-faq.html

Assuming a 32KB chunk size, 3 RAID disks, the left-symmetric parity
placement algorithm, and a filesystem block size of 1024 bytes, the
corresponding filesystem block numbers are listed below (for a 4096-byte
filesystem, divide each number by four):

2400726, 2400759, 2400763, 2400815, 2400739, 2400755, 2400751,
2400735, 2400747, 2400730, 2400775, 2400771, 2400743, 2400799,
2400811, 2400767, 2400783, 2400779
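
As a rough illustration of the arithmetic behind such a list, the following
C sketch maps one (sector, data disk index) pair to a filesystem block number
under the assumptions stated above. The function name and its parameters are
hypothetical, and the layout rule is only my reading of left-symmetric
placement, not code copied from raid5.c. With these assumptions, sector
2400748 at index 1 gives block 2400726, and sector 2400750 at index 2 gives
block 2400759, which are the first two numbers in the list.

```c
/*
 * Hypothetical helper (not kernel code): map a per-disk stripe sector and a
 * disk index to a filesystem block number on the md device, ASSUMING the
 * left-symmetric RAID-5 layout and 512-byte sectors.
 *
 * dd_idx must name a data disk for this stripe, not the parity disk.
 */
unsigned long fs_block_left_symmetric(unsigned long sector, int dd_idx,
                                      int raid_disks,
                                      unsigned long chunk_bytes,
                                      unsigned long fs_block_bytes)
{
    int data_disks = raid_disks - 1;
    unsigned long sectors_per_chunk = chunk_bytes / 512;
    unsigned long stripe = sector / sectors_per_chunk;
    unsigned long chunk_offset = sector % sectors_per_chunk;

    /* Assumed rule: parity starts on the last disk and moves one disk
     * to the left on each successive stripe. */
    int pd_idx = data_disks - (int)(stripe % (unsigned long)raid_disks);

    /* Assumed rule: data chunks start on the disk after the parity disk
     * and wrap around, so convert the disk index to a data chunk index. */
    int i = dd_idx;
    if (i < pd_idx)
        i += raid_disks;
    i -= pd_idx + 1;

    /* Position of the chunk, then of the sector, within the whole array. */
    unsigned long chunk_number = stripe * (unsigned long)data_disks
                                 + (unsigned long)i;
    unsigned long array_sector = chunk_number * sectors_per_chunk
                                 + chunk_offset;

    /* Convert 512-byte sectors to filesystem blocks. */
    return array_sector / (fs_block_bytes / 512);
}
```

Calling it as fs_block_left_symmetric(2400748, 1, 3, 32768, 1024) corresponds
to the first logged message. Passing 4096 as the last argument gives the
block number for a 4096-byte filesystem, rounded down.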

If those blocks belong to files, I think we can use debugfs to find the
files as follows:

    debugfs /dev/md0
    icheck block_number   (displays the inode number that owns the block)
    ncheck inode_number   (displays the path name for that inode)

The following RAID-5 patch makes the error message print the filesystem
block number directly.

Gadi

--- linux/drivers/block/raid5.c~ Tue Jul 21 20:41:44 1998
+++ linux/drivers/block/raid5.c Tue Jul 21 20:41:44 1998
@@ -1218,7 +1218,7 @@
 	if (sh->phase != PHASE_COMPLETE && sh->phase != PHASE_BEGIN)
 		PRINTK(("stripe %lu catching the bus!\n", sh->sector));
 	if (sh->bh_new[dd_idx]) {
-		printk("raid5: bug: stripe->bh_new[%d], sector %lu exists\n", dd_idx, sh->sector);
+		printk("raid5: bug: stripe->bh_new[%d], sector %lu exists, block %lu\n", dd_idx, sh->sector, compute_blocknr(sh, dd_idx));
 		printk("raid5: bh %p, bh_new %p\n", bh, sh->bh_new[dd_idx]);
 		lock_stripe(sh);
 		md_wakeup_thread(raid_conf->thread);
