Re: email as a bona fide git transport

From: Theodore Y. Ts'o
Date: Thu Oct 17 2019 - 10:47:23 EST


On Thu, Oct 17, 2019 at 04:01:33PM +0200, Vegard Nossum wrote:
>
> In your example, couldn't Darrick simply base his xfs work on the latest
> xfs branch that was pulled by Linus? That should be up to date with all
> things xfs without having any of the things that made Linus's tree not
> work for him.

Sure, but sometimes there are changes in subsystems which the file
system depends upon that were also merged by Linus. So for example,
the ext4 branch might be based on v5.3-rc4, and gets pulled after v5.3
is released, along with a huge number of other subsystem trees, so the
delta between v5.3 and v5.3-rc1 is ***huge***.

So while I could base my development on my previous ext4.git branch
(based off of v5.3-rc4), at *some* point I need to be able to sync up
with upstream. And the usual way to do this is to start a new
development branch based on (for example) v5.4-rc2, or in some cases
v5.4-rc4.

We could keep the development branch on based off of v5.3-rc4, and
wait until things stablize, and *then* merge in v5.4-rcX, when
v5.4-rcX is finally stable, but that makes for a more complex merge,
and so it means that things like "git log origin..master" don't really
work any more. So the preferred development practice is very much....

rc2 o --> patch 1 --> patch 2 --> ... --> patch N
(origin) (master)

Where the "master" branch gets merged into the rewinding "dev" branch
(which works much like git's pu branch), and where the "master" branch
is what Linus will merge at the next merge window.

> Otherwise, you could apply the stabilisation patches and then do your
> final testing in a branch that merges that with your patchset, like so:
>
> rc1 o -----> fixup A ------> fixup B ---->o merge (tested)
> (base) \ /
> \ /
> ---> patch 001 --> patch 002 -->o patchset (submitted)

I cloud do that, but remember that the checked out kernel tree is
about a gigabyte (this assumes using git clone --shared, so it doesn't
include the git pack files, and this is source only; the object files
are another 2.6 GB). I could keep separate checked out trees, and
separate build trees, but that burns a lot of disk/SSD space. Or I
could switch back and forth by using "git checkout" between the
development branch and the branch with the stablization patches, but
then I'm constantly having to rebuild the object files, and ccache
only helps so much.

So it's much simpler to put the fixup patches at the on top of the
origin, and then mail them out without having to play git branch
rebasing gymnastics. When the patch series is finally ready to roll,
then the maintainer will apply the patch series on a clean branch,
since hopefully by then -rc3 or -rc4 is finally stable enough to use
as an origin point.

So the idea is that developer might be sending out revisions of their
patches on top of -rc1 plus fixup patches, but then the final version
of the patch series, after a few rounds of review, gets applied on top
of -rc3 or -rc4.

> I think the more difficult problem to solve might be how to ensure that
> the base commit is actually public/reachable when this is the intention.
> A bot watching the mailing list could always respond with a "Hey, I
> don't have that, could you rebase the series or push it somewhere?". But
> it would be even better if git could tell you when you're about to
> submit a patch. Maybe something like:
>
> git send-email --ensure-reachable-from [remote] rev^^..
>
> In the worst case, I guess the base commit will just not be available --
> the email will still have a sha1 on it, though, and which might still be
> usable as an identifier for the patch/patchset. If not, it's still not
> worse than the current workflow (which would still work).

... or what we can do is allow the developer to specify the intended
base --- e.g., -rc1, even though his patchset was against "-rc1 plus fixups".

- Ted