Re: Transparent compression in the FS

From: Jeff Garzik
Date: Thu Oct 16 2003 - 20:17:52 EST


Christopher Li wrote:
The idea of this sort of block level hashing to allow
sharing of identical blocks seems attractive, but I wouldn't
trust any design that did not accept as given that there
would be false positives. This means that a write would
have to not only hash the block but then, if there is a
collision, do a compare of the raw data. Then you have to
add the overhead of keeping lists of blocks that match a hash
value, plus a reference count for each block. Further,


Then writing every data block will need to dirty at least 2 blocks.
It will also need to read back the original block if the hash already exists.
There must be some performance hit.
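
To make that cost concrete, here is a minimal sketch of the write path
being described. Every name in it (block_ref, lookup_by_hash, read_block
and so on) is a hypothetical stand-in rather than an existing kernel
interface; the point is only to show where the extra read and the extra
dirtied metadata come from.

#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 4096

struct block_ref {
	uint64_t blocknr;		/* on-disk block number */
	uint32_t refcount;		/* how many extents point at it */
	struct block_ref *next;		/* other blocks in this hash bucket */
};

/* Hypothetical helpers standing in for the FS's own machinery. */
struct block_ref *lookup_by_hash(const unsigned char *digest);
int read_block(uint64_t blocknr, void *buf);
uint64_t allocate_and_write_block(const void *data);
void insert_hash_entry(const unsigned char *digest, uint64_t blocknr);
void hash_block(const void *data, unsigned char *digest_out);

/*
 * Write one data block, sharing an existing on-disk block when its
 * contents match.  Returns the block number the caller should map.
 */
uint64_t dedup_write_block(const void *data)
{
	unsigned char digest[20];
	char existing[BLOCK_SIZE];
	struct block_ref *cand;
	uint64_t blocknr;

	hash_block(data, digest);

	/*
	 * Walk every block recorded under this hash.  A matching hash
	 * is only a hint: the candidate must be read back and compared
	 * byte for byte, because false positives are possible.
	 */
	for (cand = lookup_by_hash(digest); cand; cand = cand->next) {
		if (read_block(cand->blocknr, existing) < 0)
			continue;
		if (memcmp(existing, data, BLOCK_SIZE) == 0) {
			cand->refcount++;	/* dirties refcount metadata */
			return cand->blocknr;	/* share the existing block */
		}
	}

	/* No true match: write a new block and index it. */
	blocknr = allocate_and_write_block(data);
	insert_hash_entry(digest, blocknr);	/* dirties the hash index too */
	return blocknr;
}

Note the two extra costs visible here: the read-back-and-memcmp on every
hash hit, and the refcount/index updates that get dirtied alongside the
data block itself.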


In my case at least, we're talking about archival storage. Plan9 uses a "write buffer" of 1-2GB or so, to mitigate performance loss, which seems reasonable. With archival storage and hash indexes and such, you're certainly going to be dirtying more disk blocks than a traditional local filesystem would.
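
For illustration only, a rough sketch of how such a staging buffer might
look, reusing the hypothetical dedup_write_block() from the sketch above.
This is not Venti's actual code, just one way a 1-2GB in-memory buffer
could amortize the extra index and refcount traffic by batching it.

#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE	4096
#define BUFFER_BLOCKS	(512 * 1024)	/* roughly 2GB of 4K blocks */

/* From the sketch above; hypothetical, not a real interface. */
uint64_t dedup_write_block(const void *data);

struct write_buffer {
	unsigned char (*blocks)[BLOCK_SIZE];	/* staged block contents */
	size_t used;				/* blocks currently queued */
};

struct write_buffer *alloc_write_buffer(void)
{
	struct write_buffer *wb = calloc(1, sizeof(*wb));

	if (wb)
		wb->blocks = malloc((size_t)BUFFER_BLOCKS * BLOCK_SIZE);
	return wb;
}

/* Push every staged block through the dedup path in one pass. */
void flush_write_buffer(struct write_buffer *wb)
{
	size_t i;

	for (i = 0; i < wb->used; i++)
		dedup_write_block(wb->blocks[i]);
	wb->used = 0;
}

/* Stage one block in memory; hit the dedup machinery only on flush. */
void buffered_write_block(struct write_buffer *wb, const void *data)
{
	if (wb->used == BUFFER_BLOCKS)
		flush_write_buffer(wb);
	memcpy(wb->blocks[wb->used++], data, BLOCK_SIZE);
}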

Jeff


