fsck more often when powerfail is detected (was Re: wishfulthinking about atomic, multi-sector or full MD stripe width, writesin storage)

From: Pavel Machek
Date: Sun Apr 04 2010 - 09:47:53 EST


> > Yes, but ext3 was designed to handle the partial write (according to
> > tytso).
> I'm not sure what made you think that I said that. In practice things
> usually work out, as a conseuqence of the fact that ext3 uses physical
> block journaling, but it's not perfect, becase...

Ok; so the journalling actually is not reliable on many machines --
not even disk drive manufacturers guarantee full block writes AFAICT.

Maybe there's time to reviwe the patch to increase mount count by >1
when journal is replayed, to do fsck more often when powerfails are

> > > Also, when you enable the write cache (MD or not) you are buffering
> > > multiple MB's of data that can go away on power loss. Far greater (10x)
> > > the exposure that the partial RAID rewrite case worries about.
> >
> > Yes, that's what barriers are for. Except that they are not there on
> > MD0/MD5/MD6. They actually work on local sata drives...
> Yes, but ext3 does not enable barriers by default (the patch has been
> submitted but akpm has balked because he doesn't like the performance
> degredation and doesn't believe that Chris Mason's "workload of doom"
> is a common case). Note though that it is possible for dirty blocks
> to remain in the track buffer for *minutes* without being written to
> spinning rust platters without a barrier.

So we do wrong thing by default. Another reason to do fsck more often
when powerfails are present?
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/