Re: [RFC/PATCH 0/2] ext4: Transparent Decompression Support

From: Dhaval Giani
Date: Thu Jul 25 2013 - 11:16:16 EST


On 07/24/2013 07:36 PM, Jörn Engel wrote:
> On Wed, 24 July 2013 17:03:53 -0400, Dhaval Giani wrote:
>> I am posting this series early in its development phase to solicit some
>> feedback.
> At this state, a good description of the format would be nice.

Sure. The format is quite simple. There is a 20-byte header followed by an offset table giving us the offsets of the 16k zlib-compressed chunks. (16k is the default; it can be changed with the szip tool, and the kernel should still decompress the file, as that data is in the header.) I am not tied to the format. I used it because that is what is being used here. My final goal is to have the filesystem agnostic of the compression format, as long as it is seekable.
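
To make the idea of a seekable format concrete, here is a rough userspace sketch in Python. The field layout of the 20-byte header (magic, uncompressed size, chunk size, chunk count, reserved) and the names MAGIC, pack and read_range are assumptions made up for illustration; this is not the actual szip header and not the kernel code. It also assumes that each chunk covers 16k of uncompressed input. The point is only that a read at an arbitrary offset needs to inflate just the chunks that cover it.

```python
import struct
import zlib

# Hypothetical 20-byte header: magic, uncompressed size, chunk size,
# chunk count, reserved. NOT the real szip layout, illustration only.
HEADER = struct.Struct("<4sIIII")
MAGIC = b"SKZ0"  # made-up magic value for this sketch


def pack(data: bytes, chunk_size: int = 16 * 1024) -> bytes:
    """Compress data as independent zlib chunks behind a header and offset table."""
    chunks = [zlib.compress(data[i:i + chunk_size])
              for i in range(0, len(data), chunk_size)]
    # The table holds one file offset per chunk plus an end sentinel.
    pos = HEADER.size + 4 * (len(chunks) + 1)
    offsets = []
    for chunk in chunks:
        offsets.append(pos)
        pos += len(chunk)
    offsets.append(pos)
    header = HEADER.pack(MAGIC, len(data), chunk_size, len(chunks), 0)
    table = struct.pack("<%dI" % len(offsets), *offsets)
    return header + table + b"".join(chunks)


def read_range(blob: bytes, offset: int, length: int) -> bytes:
    """Return decompressed bytes [offset, offset + length), inflating only the needed chunks."""
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    magic, total, chunk_size, nchunks, _ = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise ValueError("not a chunked file")
    offsets = struct.unpack_from("<%dI" % (nchunks + 1), blob, HEADER.size)
    end = min(offset + length, total)
    if offset >= end:
        return b""
    first = offset // chunk_size      # first chunk touched by the read
    last = (end - 1) // chunk_size    # last chunk touched by the read
    out = bytearray()
    for i in range(first, last + 1):
        out += zlib.decompress(blob[offsets[i]:offsets[i + 1]])
    start = offset - first * chunk_size
    return bytes(out[start:start + (end - offset)])
```

With something like this, read_range(blob, 16000, 1000) on a packed file only inflates the first two chunks, since the range straddles the 16384-byte boundary. Because the chunk size is read back from the header, a file packed with a different chunk_size decodes the same way.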


>> We are implementing transparent decompression with a focus on ext4. One
>> of the main usecases is that of Firefox on Android. Currently libxul.so
>> is compressed and it is loaded into memory by a custom linker on
>> demand. With the use of transparent decompression, we can make do
>> without the custom linker. More details (i.e. code) about the linker can
>> be found at https://github.com/glandium/faulty.lib
> It is not quite clear what you want to achieve here.

To introduce transparent decompression. Let someone else do the compression for us, and supply decompressed data on demand (in this case, on a read call). This reduces the complexity that would otherwise have to be brought into the filesystem.

> One approach is
> to create an empty file, chattr it to enable compression, then write
> uncompressed data to it. Nothing in userspace will ever know the file
> is compressed, unless you explicitly call lsattr.
>
> If you want to follow some other approach where userspace has one
> interface to write the compressed data to a file and some other
> interface to read the file uncompressed, you are likely in a world of
> pain.
Why? If only a few applications are going to know the file is compressed, and read it to get decompressed data, why would it be painful? What about introducing a new flag, O_COMPR, which tells the kernel that we want this file to be decompressed if it can be? It could fall back to O_RDONLY or something like that. That gets rid of the chattr ugliness.

> Assuming you use the chattr approach, that pretty much comes down to
> adding compression support to ext4. There have been old patches for
> ext2 around that never got merged. Reading up on the problems
> encountered by those patches might be instructive.

Do you have subjects for these? When I googled for ext4 compression, I found http://code.google.com/p/e4z/ which doesn't seem to exist, and checking in my LKML archives gives too many false positives.

Thanks!
Dhaval