Re: kernel decompressor interface

From: Phillip Lougher
Date: Tue Mar 30 2010 - 20:25:49 EST


Ferenc Wagner wrote:
Hi,

While working with SquashFS code recently, I got the impression that the
current decompress_fn interface isn't best suited for general use: it
rules out real scatter/gather operation, which -- one hopes -- is a
general feature of stream decompressors. For example, if one has to
decompress data from a series of buffer_heads into a bunch of (cache)
pages (typical operation in compressed file systems), the inflate
interface in zlib.h provides the possibility of changing input and
output buffer addresses, but decompress_fn does not, necessitating extra
memory copying. On the other hand, the latter is admittedly simpler.


The decompress_fn interface is rather limited. However, it must
be borne in mind that it was adequate for its original intended
users (initramfs/initrd decompression). Squashfs (and other filesystems), on
the other hand, can certainly make use of a much better multi-call interface.
My strategy in adding LZMA support to Squashfs has been to get an implementation
using the current interface mainlined and, once this has been done, to look at
improving the decompress_fn interface.

However, I feel like their advantages could be combined by making the
fill() and flush() callbacks able to change the buffer addresses, like
int fill(void **, unsigned int) for example. Even better, some context
could also be passed around, so that they can easily carry private state.

A cursory look at the LZMA decompressor implementation suggests that
this would be a straightforward change (although I haven't done it).
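
For illustration, the proposal above (a fill() callback that can repoint the
input buffer and carries private context) might look roughly like the sketch
below. This is not the existing decompress_fn interface; struct chunk,
struct fill_ctx, fill_fn, fill_next_chunk() and consume_all() are all
hypothetical names, and consume_all() merely stands in for a decompressor's
input loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical input chunk, e.g. the data area of one buffer_head. */
struct chunk {
	const unsigned char *data;
	size_t len;
};

/* Hypothetical private state carried alongside the callback. */
struct fill_ctx {
	const struct chunk *chunks;
	size_t nr_chunks;
	size_t next;		/* index of the next chunk to hand out */
};

/*
 * Hypothetical callback type: called when the decompressor has run out of
 * input. It repoints *buf at the next chunk and returns its length, or 0
 * at end of input. No data is copied.
 */
typedef size_t (*fill_fn)(void *ctx, const unsigned char **buf);

static size_t fill_next_chunk(void *ctx, const unsigned char **buf)
{
	struct fill_ctx *fc = ctx;

	assert(fc->next <= fc->nr_chunks);
	if (fc->next == fc->nr_chunks)
		return 0;
	*buf = fc->chunks[fc->next].data;
	return fc->chunks[fc->next++].len;
}

/*
 * Stand-in for a decompressor's input loop: it only sums the input bytes,
 * but it pulls each chunk through fill() in place, as a real decompressor
 * would.
 */
static unsigned long consume_all(fill_fn fill, void *ctx)
{
	const unsigned char *buf = NULL;
	unsigned long sum = 0;
	size_t len;

	while ((len = fill(ctx, &buf)) > 0)
		for (size_t i = 0; i < len; i++)
			sum += buf[i];
	return sum;
}
```

A caller would fill in a struct fill_ctx with its chunk list and pass it,
together with fill_next_chunk, to the consumer; the consumer never needs to
know how many chunks there are or where they live.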

LZMA decompressors have a quirk in that they use the output buffer
as the history buffer (e.g. look for peek_old_byte() in decompress_unlzma.c).
This means any multi-call interface such as zlib which modifies the output
buffer pointer dynamically (without allowing the decompressor to look back at
previously passed in buffers) won't work. A multi-call interface that
passes the output buffers in an iovec style array should work though
(incidentally this is why Squashfs passes the output buffers as an array
to the decompressor wrapper even though LZMA cannot as yet make use of it).
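
To make the point concrete, here is a minimal sketch of why an iovec style
array helps: with every output segment visible, a look-back into earlier
output can cross segment boundaries. It is not the code in
decompress_unlzma.c; struct out_vec, struct out_state, out_byte(), put_byte()
and copy_match() are hypothetical names, and the linear segment walk is chosen
for clarity rather than speed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical iovec-style output segment, e.g. one cache page. */
struct out_vec {
	unsigned char *base;
	size_t len;
};

struct out_state {
	struct out_vec *vec;
	size_t nr_vecs;
	size_t pos;	/* total bytes written so far, across all segments */
};

/* Map a global output offset to the byte that holds it. */
static unsigned char *out_byte(const struct out_state *s, size_t off)
{
	for (size_t i = 0; i < s->nr_vecs; i++) {
		if (off < s->vec[i].len)
			return s->vec[i].base + off;
		off -= s->vec[i].len;
	}
	return NULL;	/* beyond the supplied segments */
}

/* Append one byte; returns -1 once the segments are full. */
static int put_byte(struct out_state *s, unsigned char c)
{
	unsigned char *p = out_byte(s, s->pos);

	if (!p)
		return -1;
	*p = c;
	s->pos++;
	return 0;
}

/*
 * Copy an LZ-style match: len bytes starting dist bytes back in the
 * output. The source byte may sit in an earlier segment than the
 * destination, which is what a single moving output pointer cannot offer.
 */
static int copy_match(struct out_state *s, size_t dist, size_t len)
{
	assert(dist >= 1 && dist <= s->pos);
	while (len--) {
		if (put_byte(s, *out_byte(s, s->pos - dist)) < 0)
			return -1;
	}
	return 0;
}
```

With two 4-byte segments, writing "abc" and then copying a 5-byte match at
distance 3 fills the second segment entirely from bytes that were written
into the first.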

The following original thread about inclusion of LZMA into the kernel
has some relevant information.

http://www.mail-archive.com/linux-embedded@xxxxxxxxxxxxxxx/msg01371.html

Phillip

Do you think that
1. it could be done for other decompressors (lzo, bunzip) as well, and
2. it should be done in any form (interface replacement or extension)?
