Re: [PATCH] lib/zstd: use div_u64() to let it build on 32-bit
From: Nick Terrell
Date: Wed Jun 28 2017 - 01:29:57 EST
> Please don't top post.
Sorry about that.
> Which function needs 1KB of stack space? That's quite a lot.
FSE_buildCTable_wksp(), FSE_compress_wksp(), and HUF_readDTableX4()
required over 1 KB of stack space.
> I can see in [1] that there are some on-stack buffers replaced by
> pointers to the workspace. That's good, but I would like to know if
> there's any hidden gem that grabs the precious stack space.
I've been hunting down functions that use up the most stack space and
replacing buffers with pointers to the workspace. I compiled the code
with -Wframe-larger-than=512 and reduced the stack usage of all offending
functions. In the next version of the patch, no function uses more than
400 B of stack space. We'll be porting the changes back upstream as well.
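
To illustrate the kind of change involved, here is a minimal sketch of moving
an on-stack table into a caller-provided workspace. It is not code from the
patch: struct count_wksp, max_byte_count_wksp() and MAX_SYMBOL are hypothetical
names made up for this example, and the sketch assumes the caller passes a
suitably aligned workspace.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOL 255

/*
 * Hypothetical workspace layout (not the real zstd one). With a 4-byte
 * unsigned this table is 1 KB, which would trip -Wframe-larger-than=512
 * if it were declared as a local array.
 */
struct count_wksp {
	unsigned counts[MAX_SYMBOL + 1];
};

/*
 * Stores the count of the most frequent byte of src in *max_out.
 * The table lives in the caller's workspace, so this function's own
 * stack frame holds only a few scalars.
 * Assumes workspace is suitably aligned for struct count_wksp.
 * Returns 0 on success, -1 if the workspace is too small.
 */
int max_byte_count_wksp(const unsigned char *src, size_t src_size,
			unsigned *max_out, void *workspace, size_t wksp_size)
{
	struct count_wksp *w = workspace;
	unsigned max = 0;
	size_t i;

	if (wksp_size < sizeof(*w))
		return -1;

	memset(w->counts, 0, sizeof(w->counts));
	for (i = 0; i < src_size; i++)
		w->counts[src[i]]++;
	for (i = 0; i <= MAX_SYMBOL; i++)
		if (w->counts[i] > max)
			max = w->counts[i];

	*max_out = max;
	return 0;
}
```

The caller allocates the workspace once (for example alongside the
compression context) and passes it down, so the cost is paid outside the
call chain's stack frames.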
> Hm, I'd suggest to create a version optimized for kernel, eg. expecting
> that 4+ GB buffer will never be used and you can use the most fitting int
> type. This should affect only the function signatures, not the
> algorithm implementation, so porting future zstd changes should be
> straightforward.
If the functions were exposed, then I would agree 100%. However, since
these are internal functions, and the rest of zstd uses size_t to represent
buffer sizes, I think it would be awkward to change just FSE/HUF functions.
I also prefer size_t because it is friendlier to the optimizer, especially
the loop optimizer, since the compiler doesn't have to worry about a narrower
unsigned index wrapping around.
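
As a rough illustration of the index-width point, consider the two loops
below. The function names are hypothetical and not part of zstd. On a 64-bit
target, start + i in the first version is computed modulo 2^32 and then
widened to form the address, and the compiler has to preserve that wraparound
behaviour; in the second version the index already has pointer width. Whether
this changes the generated code depends on the compiler and target, so treat
it as a sketch of the concern rather than a measurement.

```c
#include <stddef.h>

/* 32-bit index: start + i wraps modulo 2^32 before it is used as an offset. */
unsigned long sum_u32_index(const unsigned char *src, unsigned start,
			    unsigned n)
{
	unsigned long sum = 0;
	unsigned i;

	for (i = 0; i < n; i++)
		sum += src[start + i];
	return sum;
}

/* size_t index: the offset arithmetic is done at pointer width. */
unsigned long sum_size_index(const unsigned char *src, size_t start,
			     size_t n)
{
	unsigned long sum = 0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += src[start + i];
	return sum;
}
```

Both functions return the same result for in-range arguments; the difference
is only in what the compiler is allowed to assume about the index.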
On a related note, zstd performs automatic optimizations to improve
compression speed and reduce memory usage when given small sources, which
is the common case in the kernel.