Re: Transparent compression in the FS

From: Pádraig
Date: Thu Oct 16 2003 - 11:42:40 EST


Andrea Arcangeli wrote:
> Hi Jeff,
>
> On Wed, Oct 15, 2003 at 11:13:27AM -0400, Jeff Garzik wrote:
>
>> Josh and others should take a look at Plan9's venti file storage
>> method -- archival storage is a series of unordered blocks, all of
>> which are indexed by the sha1 hash of their contents. This magically
>> coalesces all duplicate blocks by its very nature, including the
>> loooooong runs of zeroes that you'll find in many filesystems. I bet
>> savings on "all bytes in this block are zero" are worth a bunch
>> right there.
>
> I had a few ideas on the above.
>
> if the zero blocks are the problem, there's a tool called zum that nukes
> them and replaces them with holes. I use it sometime, example:
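To make the venti scheme Jeff describes above concrete, here's a
minimal sketch in Python of a content-addressed block store (my own
illustration, not venti's actual format; BlockStore and store_file are
just names I made up): each block is keyed by the SHA-1 of its
contents, so identical blocks, including all-zero ones, are stored
exactly once.

import hashlib

BLOCK_SIZE = 4096

class BlockStore:
    """Toy content-addressed store: blocks indexed by SHA-1 of contents."""
    def __init__(self):
        self.blocks = {}        # SHA-1 digest -> block data

    def put(self, data):
        key = hashlib.sha1(data).digest()
        # A block with the same contents hashes to the same key,
        # so duplicates are stored only once.
        self.blocks.setdefault(key, data)
        return key              # callers keep keys to retrieve blocks later

    def get(self, key):
        return self.blocks[key]

def store_file(store, path):
    """Split a file into fixed-size blocks; return the list of block keys."""
    keys = []
    with open(path, 'rb') as f:
        while True:
            block = f.read(BLOCK_SIZE)
            if not block:
                break
            keys.append(store.put(block))
    return keys

A file full of long zero runs collapses to many references to the
single all-zero block, which is where the "all bytes in this block are
zero" savings come from.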

Yes, post-processing with a tool like zum is useful.
Note that GNU cp (--sparse) also inserts holes in files.
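For the hole side of things, a post-processing pass along the lines of
what zum or cp --sparse do could look roughly like this (a simplified
sketch, not either tool's actual code; sparse_copy is a name I made
up): copy block by block, and seek over all-zero blocks instead of
writing them, so the destination ends up with holes.

import os

BLOCK_SIZE = 4096

def sparse_copy(src, dst):
    """Copy src to dst, turning all-zero blocks into holes."""
    zero = b'\0' * BLOCK_SIZE
    with open(src, 'rb') as fin, open(dst, 'wb') as fout:
        while True:
            block = fin.read(BLOCK_SIZE)
            if not block:
                break
            if block == zero[:len(block)]:
                # Seek past the zero block instead of writing it; the
                # filesystem leaves a hole with no data block allocated.
                fout.seek(len(block), os.SEEK_CUR)
            else:
                fout.write(block)
        # Set the final length in case the file ends in a hole.
        fout.truncate(fout.tell())

As with cp --sparse=always, the saving only materialises on
filesystems that actually support holes.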

I've thought a bit about this too, and in general I don't think the
overhead of merging duplicate blocks/files at write time is worth it.
Periodic post-processing with a merging tool is much more efficient
IMHO. Of course that is currently only possible at the file level, but
I think that's all that is generally useful anyway. I guess it's
appropriate to plug my merging tool here :-)
http://www.pixelbeat.org/fslint
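As a rough idea of what such a file-level merging pass does (a
simplified sketch of the general approach, not fslint's actual code;
merge_duplicates and file_sha1 are names I made up): group candidate
files by size, confirm duplicates by content hash, then replace the
extra copies with hard links.

import hashlib
import os
from collections import defaultdict

def file_sha1(path):
    h = hashlib.sha1()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(65536), b''):
            h.update(chunk)
    return h.hexdigest()

def merge_duplicates(root):
    """Find files with identical contents under root and hard-link them."""
    by_size = defaultdict(list)
    for dirpath, _, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    for paths in by_size.values():
        if len(paths) < 2:
            continue                    # a unique size can't be a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_sha1(path)].append(path)
        for dups in by_hash.values():
            original = dups[0]
            for dup in dups[1:]:
                os.remove(dup)          # replace the copy with a hard link
                os.link(original, dup)

Note that merging with hard links means a later write through one name
changes the other, so it only suits genuinely read-only duplicates.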

Pádraig.
