Re: [PATCH] lz4: Fix kernel decompression speed

From: Arvind Sankar
Date: Tue Aug 04 2020 - 11:19:58 EST


On Tue, Aug 04, 2020 at 02:57:50AM +0000, Nick Terrell wrote:
>
>
> > On Aug 3, 2020, at 6:56 PM, Arvind Sankar <nivedita@xxxxxxxxxxxx> wrote:
> >
>
> > -- I see that ZSTD_copy8 is already using __builtin_memcpy,
> > but there must be more that can be optimized? There are a couple of
> > 1- or 2-byte copies in huf_decompress.c.
>
> Oh wow, I totally missed that, I guess I stopped looking once performance
> was about what I expected, nice find!
>
> I suspect it is mostly the memcpy inside of HUF_decodeSymbolX4(), since
> that should be the only hot one [1].
>
> Do you want to put up the patch to fix the memcpy’s in zstd Huffman, or should I?
>
> I will be submitting a patch upstream to migrate all of zstd’s memcpy() calls to
> use __builtin_memcpy(), since I plan on updating the version in the kernel to
> upstream zstd in the next few months. I was waiting until the compressed kernel
> patch set landed, so I didn't distract from it.
>
> [0] https://gist.github.com/terrelln/9bd53321a669f62683c608af8944fbc2
> [1] https://github.com/torvalds/linux/blob/master/lib/zstd/huf_decompress.c#L598
>
> Best,
> Nick
>

I think it's better if you do the zstd changes, as I'm not at all
familiar with that code.
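
For anyone following along, here is a rough sketch of the kind of change
being discussed. It is not the actual zstd code: the helper names are
made up for illustration, and it assumes a GCC- or Clang-compatible
compiler that provides __builtin_memcpy(). The idea is that a small copy
with a constant size can be expanded inline as a plain store, whereas a
call to memcpy() may end up as an out-of-line function call in
environments where the compiler is not allowed to treat memcpy() as a
builtin.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration only; not the zstd functions. */

/* Plain memcpy(): depending on the build environment, this may be
 * compiled as a real function call even for a 2-byte copy. */
static inline void store16_libcall(void *dst, uint16_t v)
{
	memcpy(dst, &v, sizeof(v));
}

/* __builtin_memcpy() (GCC/Clang): with a constant size the compiler can
 * expand the copy inline, typically as a single 2-byte store. */
static inline void store16_builtin(void *dst, uint16_t v)
{
	__builtin_memcpy(dst, &v, sizeof(v));
}

/* Both variants must write exactly the same bytes and nothing else. */
static void store16_selfcheck(void)
{
	unsigned char a[4] = {0}, b[4] = {0};

	store16_libcall(a + 1, 0xBEEF);
	store16_builtin(b + 1, 0xBEEF);
	assert(memcmp(a, b, sizeof(a)) == 0);
	assert(a[0] == 0 && a[3] == 0);
}
```

The two helpers behave identically; only the generated code may differ,
so comparing the disassembly of both in the target build is the way to
confirm whether the switch actually matters there.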

Thanks.