RE: Re: [PATCH 1/1] lz4: Implement lz4 with dynamic offset length.
From: Vaneet Narang
Date: Tue Apr 03 2018 - 09:44:29 EST
Hi Sergey,
>You shrink a 2 bytes offset down to a 1 byte offset, thus you enforce that
The 2-byte offset is not shrunk to 1 byte. Only 1 bit out of the 16 offset
bits is reserved, so only 15 bits can be used to store the offset value.
>'page should be less than 32KB', which I'm sure will be confusing.
lz4_dyn will work on larger data lengths (> 32K), but in that case the
compression ratio may not be better than LZ4. This is the same as LZ4
compressing data larger than 64K (16 bits): LZ4 can't store an offset of more
than 64K, and similarly LZ4 dyn can't store an offset of more than 32K.
The LZ4 code already handles this case, and similar handling has been added for LZ4 dyn.
Handling in LZ4 dyn: max_distance is 32K for lz4_dyn and 64K for LZ4:
int max_distance = dynOffset ? MAX_DISTANCE_DYN : MAX_DISTANCE;
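To illustrate what this limit means for match finding, here is a minimal
sketch. It is not code from the patch: the helper match_in_window() is
hypothetical, and the two constants are assumed to be (64K - 1) and (32K - 1),
i.e. the largest values a 16-bit and a 15-bit offset can hold.

```c
#include <assert.h>
#include <stddef.h>

#define KB 1024

/* Assumed values: a 16-bit offset for LZ4, a 15-bit offset for lz4_dyn. */
#define MAX_DISTANCE      ((64 * KB) - 1)
#define MAX_DISTANCE_DYN  ((32 * KB) - 1)

/*
 * Hypothetical helper: returns 1 if a match starting at match_pos can be
 * referenced from cur_pos, i.e. the distance still fits in the offset field.
 */
static int match_in_window(size_t cur_pos, size_t match_pos, int dynOffset)
{
	size_t max_distance = dynOffset ? MAX_DISTANCE_DYN : MAX_DISTANCE;

	assert(match_pos < cur_pos);
	return (cur_pos - match_pos) <= max_distance;
}
```

With these assumed values, a match 40000 bytes back is usable by LZ4 but is
skipped by lz4_dyn, which is why the ratio can drop on inputs larger than 32K.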
>And you
>rely on lz4_dyn users to do the right thing - namely, to use that 'nice'
>`#if (PAGE_SIZE < (32 * KB))'.
They don't need to add this code; they just need to choose the compression algorithm
that fits their requirement. If the source length is less than 32K, then lz4_dyn
would give a better compression ratio than LZ4.
Considering ZRAM as a user of LZ4 dyn, we have added this check for PAGE_SIZE, which
is the source length. This code adds lz4 dyn to the preferred list of compression
algorithms when PAGE_SIZE is less than 32K.
>Apart from that, lz4_dyn supports only data
>in up to page_size chunks. Suppose my system has page_size of less than 32K,
>so I legitimately can enable lz4_dyn, but suppose that I will use it
>somewhere where I don't work with page_size-d chunks. Will I able to just
>do tfm->compress(src, sz) on random buffers? The whole thing looks to be
>quite fragile.
No, that's not true. lz4_dyn can work on random buffers, and they need not be
page-size chunks. There is no difference in how LZ4 and LZ4 dyn work.
The only difference is that LZ4 dyn doesn't use a fixed offset size. This concept is
already used in LZO, which uses a dynamic metadata size based on match length and
match offset. It uses different markers for this, which define the length of the metadata.
lzodefs.h:
#define M1_MAX_OFFSET 0x0400
#define M2_MAX_OFFSET 0x0800
#define M3_MAX_OFFSET 0x4000
#define M4_MAX_OFFSET 0xbfff
#define M1_MIN_LEN 2
#define M1_MAX_LEN 2
#define M2_MIN_LEN 3
#define M2_MAX_LEN 8
#define M3_MIN_LEN 3
#define M3_MAX_LEN 33
#define M4_MIN_LEN 3
#define M4_MAX_LEN 9
#define M1_MARKER 0
#define M2_MARKER 64
#define M3_MARKER 32
#define M4_MARKER 16
Similarly for LZ4 Dyn, we have used 1 bit as a marker to determine offset length.
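As a rough sketch of the idea only: the layout below is an assumption for
illustration and is not taken from the patch. It assumes the marker is the top
bit of the first offset byte, that a clear marker means a 1-byte (7-bit)
offset, and that a set marker means a 2-byte (15-bit) offset. The names
DYN_MARKER_BIT, DYN_MAX_OFFSET, dyn_put_offset() and dyn_get_offset() are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DYN_MARKER_BIT  0x80u    /* assumed: top bit of the first byte */
#define DYN_MAX_OFFSET  0x7fffu  /* 15 usable offset bits */

/* Stores offset in dst and returns the number of bytes written (1 or 2). */
static size_t dyn_put_offset(uint8_t *dst, uint16_t offset)
{
	assert(offset <= DYN_MAX_OFFSET);

	if (offset < 0x80u) {
		dst[0] = (uint8_t)offset;	/* marker clear: 7-bit offset */
		return 1;
	}
	/* marker set: high 7 bits in the first byte, low 8 bits in the second */
	dst[0] = (uint8_t)(DYN_MARKER_BIT | (offset >> 8));
	dst[1] = (uint8_t)(offset & 0xffu);
	return 2;
}

/* Reads an offset from src and returns the number of bytes consumed. */
static size_t dyn_get_offset(const uint8_t *src, uint16_t *offset)
{
	if (!(src[0] & DYN_MARKER_BIT)) {
		*offset = src[0];
		return 1;
	}
	*offset = (uint16_t)(((src[0] & 0x7fu) << 8) | src[1]);
	return 2;
}
```

Under this assumed layout, short offsets cost one byte instead of two, and the
largest offset that can be stored is 32K - 1.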
Thanks & Regards,
Vaneet Narang