Re: Journaling pointless with today's hard disks?

From: H. Peter Anvin (hpa@zytor.com)
Date: Mon Nov 26 2001 - 19:24:00 EST


Followup to: <20011126165920.N730@lynx.no>
By author: Andreas Dilger <adilger@turbolabs.com>
In newsgroup: linux.dev.kernel
>
> The other thing that concerns a journaling fs is write ordering. If you
> can _guarantee_ that an entire track (or whatever) can be written to disk
> in _all_ cases, then it is OK to reorder write requests within that track
> AS LONG AS YOU DON'T REORDER WRITES WHERE YOU SKIP BLOCKS THAT ARE NOT
> GUARANTEED TO COMPLETE.
>
> Generally, in Linux, ext3 will wait on all of the journal transaction
> blocks to be written before it writes a commit record, which is its way
> of guaranteeing that everything before the commit is valid. If you start
> write caching the transaction blocks, return, and then write the commit
> record to disk before the other transaction blocks are written, this is
> SEVERELY BROKEN. If it was guaranteed that the commit record would hit
> the platters at the same time as the other journal transaction blocks,
> that would be the minimum acceptable behaviour.
>
> Obviously a working TCQ or write barrier would also allow you to optimize
> all writes before the commit block is written, but that should be an
> _enhancement_ above the basic write operations, only available if you
> start using this feature.
>

Indeed; having explicit write barriers would be a very useful feature,
but the drives MUST default to strict ordering unless reordering (with
write barriers) has been enabled explicitly by the OS.
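
To make the ordering requirement concrete, here is a minimal user-space
sketch of "wait for the transaction blocks, then write the commit
record". It is not ext3's code: COMMIT_MAGIC, the record layout, and
write_transaction() are hypothetical names chosen for illustration, and
os.fsync() merely stands in for "wait until the blocks are stable". The
sketch assumes the drive does not acknowledge a write and then reorder
it behind the host's back, which is exactly the guarantee under
discussion.

```python
import os
import struct
import tempfile

COMMIT_MAGIC = b"CMIT"  # hypothetical marker, not ext3's on-disk format


def write_transaction(fd, blocks, txn_id):
    """Write journal blocks, wait for them, then write the commit record."""
    for block in blocks:
        os.write(fd, block)
    # Wait here: every transaction block must be stable before the commit.
    os.fsync(fd)
    # Only now is it safe to declare the transaction valid.
    os.write(fd, COMMIT_MAGIC + struct.pack("<II", txn_id, len(blocks)))
    os.fsync(fd)


def demo():
    """Write one two-block transaction to a temp file and return its bytes."""
    fd, path = tempfile.mkstemp()
    try:
        write_transaction(fd, [b"A" * 512, b"B" * 512], txn_id=1)
        os.close(fd)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.unlink(path)
```

If the first fsync() returned while the blocks were still only in a
volatile write cache, the commit record could reach the platters first,
which is the broken case described in the quoted text.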

Furthermore, I would like to add the following constraint to your
writeup:

** For each individual sector, a write MUST either complete or not
   take place at all. In other words, writes are guaranteed to be
   atomic on a sector-by-sector basis.
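
A toy model of what this constraint buys, assuming it holds: power can
be lost between sectors but never inside one. SimDisk, crash_after, and
possible_states() are hypothetical names for this sketch, not any real
drive or kernel interface.

```python
SECTOR = 512


class SimDisk:
    """Simulated disk on which each sector write is all-or-nothing."""

    def __init__(self, n_sectors):
        self.sectors = [bytes(SECTOR) for _ in range(n_sectors)]

    def write(self, start, data, crash_after=None):
        """Write whole sectors starting at sector `start`.

        crash_after=k simulates power loss after k sectors have completed;
        the remaining sectors keep their old contents.
        """
        assert len(data) % SECTOR == 0
        count = len(data) // SECTOR
        done = count if crash_after is None else min(crash_after, count)
        for i in range(done):
            self.sectors[start + i] = data[i * SECTOR:(i + 1) * SECTOR]
        return done


def possible_states(old, new):
    """Contents reachable when overwriting `old` with `new` and losing
    power at any sector boundary (including before and after the write)."""
    count = len(new) // SECTOR
    states = []
    for k in range(count + 1):
        disk = SimDisk(count)
        disk.write(0, old)
        disk.write(0, new, crash_after=k)
        states.append(b"".join(disk.sectors))
    return states
```

Under this model a record that fits in one sector is always either
entirely old or entirely new, while a two-sector record has a third,
mixed state that recovery code would have to detect.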

        -hpa

P.S. Thanks, Andre, for taking the initiative of getting an actual
commit model into the standardized specification. Otherwise we'd be
doomed to continue down the path where what operating systems need for
sane operation and what disk drives provide are increasingly
divergent.

-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt	<amsp@zytor.com>