Re: [GIT PULL][PATCH v4 0/9] Update to zstd-1.4.6
From: Nick Terrell
Date: Thu Oct 01 2020 - 14:36:47 EST
> On Oct 1, 2020, at 3:18 AM, David Sterba <dsterba@xxxxxxx> wrote:
>
> On Wed, Sep 30, 2020 at 08:49:49PM +0000, Nick Terrell wrote:
>>> On Sep 29, 2020, at 11:53 PM, Nick Terrell <nickrterrell@xxxxxxxxx> wrote:
>>>
>>> From: Nick Terrell <terrelln@xxxxxx>
>>
>> It has been brought to my attention that patch 3 hasn’t made it to patchwork,
>> likely because it is too large. I’ll include a pull request in the next cover letter,
>> together with the patches (if needed).
>
> The patch 3/9 saved to a file is 1.6M, over 35000 lines, the diffstat
> says:
>
> 66 files changed, 24268 insertions(+), 12889 deletions(-)
>
> Seriously, this is wrong in so many ways. There's the rationale for
> one-time change etc, but the actual result is beyond what I would accept
> and would not encourage anyone to merge as-is.

I’m open to suggestions on how to get a zstd update done better. I don’t
know of any way to break this patch up into smaller patches that all compile.
The code is all generated directly from upstream and modified to work in the
kernel by automated scripts.

I think the benefits of updating zstd are pretty clear: bug fixes, 3 years of testing,
features, debuggability, support from zstd upstream, and significant performance
improvements.

So I hope we can come up with a way forward to get this merged.

A patch this large is a one-time change. But zstd updates in general
will be large, containing hundreds of commits’ worth of changes (as opposed to
~3500 commits and a structure change in this diff). E.g. the diff between
two upstream versions ranges from 50KB to 500KB. Zstd is an actively
maintained project, so there is going to be churn when consuming it. But it
also means that we’re actively supporting the project if any problems occur.

My view is that kernel developers don’t need to review upstream zstd’s code. We
should focus on the diff from upstream, and on ensuring that everything works in the
kernel environment. The imported code from upstream zstd is ~30K LOC, which is
too large for anyone to reasonably review.

As mentioned in the patch, this commit shows the diff from upstream zstd, which
is much more manageable:
https://github.com/terrelln/linux/commit/467c9ea1df1100db48c020c3c8b282a2a30f5116

I generated it by importing upstream zstd as-is into the kernel file structure, then
running the automation to generate the kernel patch from upstream and importing
it into the kernel on top of the upstream patch.
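
To illustrate why that second diff stays small, here is a minimal sketch in Python. It is not the actual automation: the sample source, the `adapt_for_kernel` helper, and its include rewrites are all hypothetical stand-ins. It only shows the idea of diffing the as-is import against the script-transformed result, so that unchanged upstream code drops out of the review.

```python
import difflib

# Hypothetical stand-in for one upstream source file, imported as-is.
upstream = """#include <stddef.h>
#include <string.h>

size_t copy_block(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
    return n;
}
""".splitlines(keepends=True)


def adapt_for_kernel(lines):
    """Hypothetical automated rewrite; the real scripts are not shown here."""
    replacements = {
        "#include <stddef.h>\n": "#include <linux/types.h>\n",
        "#include <string.h>\n": "#include <linux/string.h>\n",
    }
    return [replacements.get(line, line) for line in lines]


# Step 1: the as-is import. Step 2: the automated transformation on top of it.
kernel = adapt_for_kernel(upstream)

# The diff worth reviewing is the one between those two steps.
diff = list(
    difflib.unified_diff(upstream, kernel, fromfile="as-is import", tofile="kernel version")
)
changed = [
    line
    for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]

print("".join(diff))
print(f"{len(changed)} changed lines out of {len(upstream)} imported lines")
```

Only the lines touched by the transformation show up in the diff; the function body, which is untouched upstream code, appears only as context.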

Best,
Nick