Re: sata badness in 2.6.20-rc1? [Was: Re: md patches in -mm]

From: Rafael J. Wysocki
Date: Fri Dec 15 2006 - 16:16:43 EST


On Friday, 15 December 2006 22:05, Andrew Morton wrote:
> On Sat, 16 Dec 2006 07:50:01 +1100
> Neil Brown <neilb@xxxxxxx> wrote:
>
> > On Friday December 15, thunder7@xxxxxxxxx wrote:
> > > From: Neil Brown <neilb@xxxxxxx>
> > > Date: Wed, Dec 06, 2006 at 06:20:57PM +1100
> > > > i.e. current -mm is good for 2.6.20 (though I have a few other little
> > > > things I'll be sending in soon, they aren't related to the raid6
> > > > problem).
> > > >
> > > 2.6.20-rc1-mm1 doesn't boot on my box, due to the fact that e2fsck gives
> > >
> > > Buffer I/O error on device /dev/md0, logical block 0
> > >
> >
> > But before that....
> > > raid5: device sdh1 operational as raid disk 1
> > > raid5: device sdg1 operational as raid disk 0
> > > raid5: device sdf1 operational as raid disk 5
> > > raid5: device sde1 operational as raid disk 6
> > > raid5: device sdd1 operational as raid disk 7
> > > raid5: device sdc1 operational as raid disk 3
> > > raid5: device sdb1 operational as raid disk 2
> > > raid5: device sda1 operational as raid disk 4
> > > raid5: allocated 8462kB for md0
> > > raid5: raid level 6 set md0 active with 8 out of 8 devices, algorithm 2
> > > RAID5 conf printout:
> > > --- rd:8 wd:8
> > > disk 0, o:1, dev:sdg1
> > > disk 1, o:1, dev:sdh1
> > > disk 2, o:1, dev:sdb1
> > > disk 3, o:1, dev:sdc1
> > > disk 4, o:1, dev:sda1
> > > disk 5, o:1, dev:sdf1
> > > disk 6, o:1, dev:sde1
> > > disk 7, o:1, dev:sdd1
> > > md0: bitmap initialized from disk: read 15/15 pages, set 1 bits, status: 0
> > > created bitmap (233 pages) for device md0
> > > md: super_written gets error=-5, uptodate=0
> > > raid5: Disk failure on sde1, disabling device. Operation continuing on 7 devices
> > > md: super_written gets error=-5, uptodate=0
> > > raid5: Disk failure on sdg1, disabling device. Operation continuing on 6 devices
> > > md: super_written gets error=-5, uptodate=0
> > > raid5: Disk failure on sdf1, disabling device. Operation continuing on 5 devices
> > > md: super_written gets error=-5, uptodate=0
> > > raid5: Disk failure on sdc1, disabling device. Operation continuing on 4 devices
> > > md: super_written gets error=-5, uptodate=0
> > > raid5: Disk failure on sdb1, disabling device. Operation continuing on 3 devices
> > > md: super_written gets error=-5, uptodate=0
> > > raid5: Disk failure on sdh1, disabling device. Operation continuing on 2 devices
> > > md: super_written gets error=-5, uptodate=0
> > > raid5: Disk failure on sdd1, disabling device. Operation continuing on 1 devices
> > > md: super_written gets error=-5, uptodate=0
> > > raid5: Disk failure on sda1, disabling device. Operation continuing on 0 devices
> >
> > Oh dear, that array isn't much good any more.!
> > That is the second report I have had of this with sata drives. This
> > was raid456, the other was raid1. Two different sata drivers are
> > involved (sata_nv in this case, sata_uli in the other case).
> > I think something bad happened in sata land just recently.
> > The device driver is returning -EIO for a write without printing any messages.
> >
>
> OK, this is bad. The wheels do appear to have fallen off sata in rc1-mm1.

I think it's happened in 2.6.19-mm1 already, since that kernel breaks md RAID1
on my box (the sata_uli case above).

Greetings,
Rafael


--
If you don't have the time to read,
you don't have the time or the tools to write.
- Stephen King
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/