Re: announce: ext2 compression patch

Todd Fries (tfries@umr.edu)
Fri, 5 Apr 1996 21:57:36 -0600 (CST)


> In article <9603221627.AA20799@gnu.mc.xerox.com>,
> Marty Leisner <leisner@sdsp.mc.xerox.com> wrote:
> >
> >I have to agree with Jerry Pournelle on this (he had a column a few
> >months ago in Byte where he talked about this).
> >
> >On the fly compression to writable file systems is a bad idea...
> >It encrypts your whole file system making recovery impossible...
>
> It should be implemented differently; compression should be
> in a file-basis, not on a device-basis. This way, if a bit or
> byte falls over on your disk, it will just cause one (part of a)
> file to get corrupted.
>
> For example, use a file system with 8K blocks. The file system
> should use 1K blocks internally. Compress every 8K block independently,
> and store the compressed 8K blocks in several 1K blocks.
>
> I split bash into 8K blocks and ran gzip -9 on every 8K block.
> The total number of blocks was 128, while the normal bash is 221 blocks.
> It's not too bad; gzip -9 on bash itself creates a bash.gz of 106
> blocks.
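
For anyone who wants to repeat the experiment quoted above, here is a
rough sketch of it. It uses Python's zlib as a stand-in for gzip -9, and
the names (CLUSTER, BLOCK, per_cluster_blocks, whole_file_blocks) are
made up for illustration. Storing a cluster raw when compression does not
shrink it is my own assumption, not something the quoted post specifies.

```python
import zlib

CLUSTER = 8 * 1024   # logical block size from the quoted proposal
BLOCK = 1024         # internal allocation unit from the quoted proposal


def blocks_needed(nbytes, block=BLOCK):
    """Number of whole blocks required to hold nbytes."""
    return (nbytes + block - 1) // block


def per_cluster_blocks(data, cluster=CLUSTER, level=9):
    """Compress each cluster independently and count the 1K blocks used."""
    total = 0
    for off in range(0, len(data), cluster):
        chunk = data[off:off + cluster]
        packed = zlib.compress(chunk, level)
        # Assumption: if compression does not help, keep the cluster raw.
        total += min(blocks_needed(len(packed)), blocks_needed(len(chunk)))
    return total


def whole_file_blocks(data, level=9):
    """Compress the file as one stream and count the 1K blocks used."""
    return blocks_needed(len(zlib.compress(data, level)))


# Synthetic sample data; substitute the bytes of a real binary to
# reproduce the kind of figures given in the quoted post.
sample = b"hello world\n" * 20000
print("raw blocks:        ", blocks_needed(len(sample)))
print("per-cluster blocks:", per_cluster_blocks(sample))
print("whole-file blocks: ", whole_file_blocks(sample))
```

Per-cluster compression will normally need more blocks than compressing
the whole file at once, since every cluster starts with an empty
dictionary and wastes part of its last 1K block. That is the price paid
for keeping clusters independent.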

Here I quote from Antonio's FAQ on his implementation of ext2
compression, the relevant section on how he implements it:

2.1 When does de/compression really occur ?
--------------------------------------------

Compression really occurs when the inode is put, i.e. closed by every
process that have a reference to it. Writing to a compressed file is
done as if the file were not compressed (uncompressed data is written
to the disk), but the file is marked dirty. When the inode is put, and
if the file is dirty, the kernel scans every cluster and compresses
those that are not compressed. Unneeded blocks are freed again.

Decompression occurs when needed, i.e. every time the kernel wants to
read a compressed cluster. Of course, the cluster remains compressed
on the disk. But this means that we will have to uncompress the
cluster every time it is accessed.
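
As I read the FAQ section above, the behaviour can be modelled roughly
like this. This is only a user-space toy, not Antonio's kernel code: the
class ToyInode, its methods, and the 8K cluster size are all invented
for illustration, and zlib stands in for whatever algorithm the patch
actually uses.

```python
import zlib

CLUSTER = 8 * 1024  # hypothetical cluster size, not taken from the FAQ


class ToyInode:
    """Toy model: writes land uncompressed, compression happens on put()."""

    def __init__(self):
        self.clusters = {}   # cluster index -> (is_compressed, payload)
        self.dirty = False

    def write_cluster(self, idx, data):
        # Writing stores plain data and only marks the file dirty.
        self.clusters[idx] = (False, bytes(data))
        self.dirty = True

    def read_cluster(self, idx):
        # Decompress on every read; the stored copy stays compressed.
        compressed, payload = self.clusters[idx]
        return zlib.decompress(payload) if compressed else payload

    def put(self):
        # Last reference dropped: compress clusters that are still raw.
        if not self.dirty:
            return
        for idx, (compressed, payload) in list(self.clusters.items()):
            if not compressed:
                packed = zlib.compress(payload)
                if len(packed) < len(payload):   # keep raw if no gain
                    self.clusters[idx] = (True, packed)
        self.dirty = False


ino = ToyInode()
ino.write_cluster(0, b"a" * CLUSTER)
ino.put()
print(len(ino.clusters[0][1]), "bytes stored for an 8K cluster")
```

Note the cost visible in read_cluster(): nothing caches the decompressed
data in this toy, so every read of a compressed cluster pays for the
decompression again.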

He uses, IMHO wisely, the 'chattr +c' attribute to indicate whether or
not a file is compressed.

Also, it is not the filesystem structures but only the file data that
is compressed. Therefore, in the event of block device corruption, more
data could be wiped out because it is compressed, but the filesystem
itself is no more susceptible to corruption than it would be without
the compression.
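
To illustrate the point about damage staying local, here is a small
sketch. It assumes independently compressed 8K clusters and uses zlib,
whose built-in checksum makes the damage easy to detect. The helper
names are hypothetical, and none of this reflects how the ext2 patch
itself detects or handles corruption.

```python
import zlib

CLUSTER = 8 * 1024  # assumed cluster size for this illustration


def compress_clusters(data):
    """Compress each cluster as its own independent zlib stream."""
    return [zlib.compress(data[off:off + CLUSTER])
            for off in range(0, len(data), CLUSTER)]


def readable_clusters(stored):
    """Return the indexes of clusters that still decompress cleanly."""
    ok = []
    for idx, blob in enumerate(stored):
        try:
            zlib.decompress(blob)
            ok.append(idx)
        except zlib.error:
            pass   # this cluster is lost, the others are unaffected
    return ok


def flip_last_byte(blob):
    """Simulate one flipped bit, here in the stream's checksum trailer."""
    return blob[:-1] + bytes([blob[-1] ^ 0x01])


data = bytes(range(256)) * 128          # 32K of sample data, 4 clusters
stored = compress_clusters(data)
stored[2] = flip_last_byte(stored[2])   # damage one cluster only
print("clusters still readable:", readable_clusters(stored))
```

One damaged bit costs a whole 8K cluster here, which is more than it
would cost in an uncompressed file, but the neighbouring clusters and
the rest of the file remain readable.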

(hope I said that right).

-- 
Todd Fries