Re: Linux-4.X-rcY patches can't be applied with git?

From: Josh Boyer
Date: Mon Oct 24 2016 - 17:02:53 EST


On Mon, Oct 24, 2016 at 3:24 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, Oct 24, 2016 at 11:25 AM, Jarod Wilson <jarod@xxxxxxxxxx> wrote:
>> It's entirely possible that we (Red Hat and the Fedora kernel team) are
>> doing something wrong here, but to the best of our knowledge, it seems
>> that the canonical upstream RC snap patches can't be applied to a tree
>> using either git or old-fashioned patch.
>
> No, you're not imagining it, it's definitely happening.
>
> What is going on is that I generate patches without the "--binary"
> flag, which means that git skips the binary diffs entirely. So the
> diff just contains
>
>   Binary files a/Documentation/media/media_api_files/typical_media_device.pdf and b/Documentation/media/media_api_files/typical_media_device.pdf differ
>
> Then, when you do "git apply", "git apply" will see that, and try to
> use the index lines to regenerate the thing, which obviously only
> works in a repository that already _has_ those objects.
>
> This is actually not new. I've skipped binary files for the last ten
> years or so in the diffs, because the diffs are completely illegible,
> and nobody has ever cared - and non-git users haven't been able to use
> them anyway.
>
> Obviously, part of it is too that we simply don't have very many
> binary files, so it almost never ends up being a problem. The
> documentation changes made them happen now.
>
> But quite frankly, I see the tar-balls and diffs as a way to have
> non-git people have a minimally working system, and their *only* point
> is for people who don't have git.
>
> And since plain "patch" cannot handle git binary diffs anyway, there
> was never any valid reason to include the binary diffs in the diff.
>
> Btw, this is why the diffs don't have renames either.
>
> The diffs would often be much smaller, and certainly much more
> legible, if I used the "-M" or "-C" flags, but since the primary
> reason for the tarballs and diffs existing is for non-git users, and
> traditional "patch" doesn't understand rename diffs, I don't do it.
>
> (Yes, modern GNU patch has actually grown support for rename diffs,
> although last I looked it gets it wrong for some of the more complex
> cases - notably cross renames).
>
> Summary:
>
> - if you have git, you shouldn't use the tar-balls and patches
>
> - if you don't have git, binary patches and renames wouldn't work for
> you anyway, so generating them is pointless and would potentially keep
> you from getting a working tree.
>
> I could easily add "--binary" to my script, because I _think_
> traditional "patch" will just ignore it as noise, but I'd honestly
> rather discourage people from downloading full tar-balls in the first
> place.
>
> Hmm?
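
As a rough illustration of the failure mode described above, the placeholder lines that git leaves behind when "--binary" is not used can be spotted before trying to apply a patch. This is only a sketch: the function name and the sample patch (including its index hashes) are made up for illustration, and the pattern assumes paths without spaces.

```python
import re

# Placeholder line git emits for a binary change when the diff was
# generated without --binary. Assumes paths contain no whitespace.
BINARY_STUB = re.compile(r"^Binary files (\S+) and (\S+) differ$")


def find_binary_stubs(patch_text):
    """Return (old_path, new_path) pairs whose binary content is absent from the patch."""
    stubs = []
    for line in patch_text.splitlines():
        match = BINARY_STUB.match(line)
        if match:
            stubs.append((match.group(1), match.group(2)))
    return stubs


# Hypothetical patch fragment; the index hashes are placeholders.
sample = """\
diff --git a/Documentation/media/media_api_files/typical_media_device.pdf b/Documentation/media/media_api_files/typical_media_device.pdf
index 0000000..1111111 100644
Binary files a/Documentation/media/media_api_files/typical_media_device.pdf and b/Documentation/media/media_api_files/typical_media_device.pdf differ
"""

for old_path, new_path in find_binary_stubs(sample):
    print("binary content missing for", new_path)
```

Any path this reports can only be reconstructed by "git apply" in a repository that already has the corresponding objects, which is the situation described in the quoted text.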

The benefit of tarballs and patches from a distribution standpoint is
purely size. And yes, disk is very cheap, but the size implications
are magnified by the fact that SRPMs are required and stored for
every binary build. Today, the Fedora kernel SRPMs are ~94.5MB each.
If we just used git-archive or something to produce a tarball of a git
tree, it would explode to 664MB. That's from a fresh clone of your
tree, run through 'git archive -o ../linux.tar.xz master'. If we just
tarred it up without xz compression, a fresh tree is 1.8GB. That's a
lot of storage, particularly when you're talking about daily builds.
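
To put rough numbers on that, here is a back-of-the-envelope sketch using the three sizes quoted above. The one-build-per-day-for-a-year retention figure is an assumption for illustration only, not a statement about how builds are actually kept, and 1.8GB is taken as 1800MB.

```python
# Sizes quoted in the paragraph above, in MB (1.8GB taken as 1800MB).
SIZES_MB = {
    "current SRPM": 94.5,
    "xz tarball of git tree": 664.0,
    "uncompressed tar of git tree": 1800.0,
}

BUILDS = 365  # assumption: one stored build per day for a year


def total_gb(size_mb, builds=BUILDS):
    """Total storage in GB (1 GB = 1000 MB here) for `builds` copies."""
    return size_mb * builds / 1000.0


baseline = SIZES_MB["current SRPM"]
for name, size in SIZES_MB.items():
    ratio = size / baseline
    print(f"{name}: {total_gb(size):.1f} GB/year, {ratio:.1f}x the current SRPM")
```

Under that assumption the xz tarball case comes out at roughly seven times the storage of the current SRPMs, before counting multiple branches or architectures.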

You could just tar up the source files themselves from a git tree and
skip .git, but that means you lose the history and more importantly
you lose any semblance of verified sources because all the signed
commits are then gone. Essentially, it's recreating what is uploaded
on kernel.org *minus* the fact that it is signed. That just seems
dumb.

Perhaps I'm not thinking of something obvious, but it seems
tarballs+patches still have legitimate use cases from a distribution
point of view. Making the patches provided on kernel.org usable by
the tooling we have available seems like a worthwhile goal to me...

josh