Re: [GIT PULL] Driver core fixes for 5.7-rc7 - take 2

From: Linus Torvalds
Date: Sun May 24 2020 - 18:25:23 EST


On Sun, May 24, 2020 at 12:45 PM Sasha Levin <sashal@xxxxxxxxxx> wrote:
>
> Interesting. My thinking around --follow was that it's like
> --full-history in the sense that it won't prune history, but it would
> also keep listing history beyond file renames.

No. It's only completely accidentally like full-history because it
sets the flag that basically says "give me the whole diff" - so that
if the file goes away, you see where it came from.

And because it wants the whole diff and doesn't limit it to just the
one file that is tracked, it ends up following both sides of the merge
because _other_ files changed in that merge.

> The --follow functionality is quite useful when looking at older
> branches and trying to understand where changes should go on those
> older branches.

It is useful, but it is ambiguous. What happens if the file came to be
two different ways in two different branches? Or what happens if two
files were combined into one?

So "git log --follow" is not _wrong_, but the operation of trying to
follow a file identity is basically broken. In git, it's not a
fundamental operation (because git isn't broken), it's just an
emulation of that broken concept that often works in practice.

It's a "let's give people what they are used to", but it really isn't
very well-defined in the general case. You think it works, because for
the simple cases it gives the "obviously correct" answer.
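
To make the ambiguity concrete, here is a toy sketch. It is not how git
implements rename detection: the line-overlap measure, the 0.5 threshold,
the function name and the file names are all assumptions made up for this
illustration. It guesses where a newly appeared path "came from" by
comparing content between a parent snapshot and a child snapshot. When
two files were combined into one, there are two equally good answers, so
there is no single identity to follow.

```python
def rename_candidates(parent, child, path, threshold=0.5):
    """Guess which parent paths the content at child[path] came from.

    parent and child are plain dicts mapping path -> file content (str).
    Hypothetical helper: a candidate is any path that vanished from the
    parent and shares at least `threshold` of its lines with the new file.
    """
    if path in parent:
        return [path]  # the path already existed, nothing to follow
    new_lines = set(child[path].splitlines())
    candidates = []
    for old_path, old_content in parent.items():
        if old_path in child:
            continue  # still present, so it was not renamed away
        old_lines = set(old_content.splitlines())
        if not old_lines:
            continue
        overlap = len(old_lines & new_lines) / len(old_lines)
        if overlap >= threshold:
            candidates.append(old_path)
    return sorted(candidates)


# Simple case: one file moved and got one extra line. One obvious answer.
simple_parent = {"old.c": "x\ny\n"}
simple_child = {"new.c": "x\ny\nz\n"}
simple = rename_candidates(simple_parent, simple_child, "new.c")

# Two files combined into one: both are legitimate origins.
merged_parent = {"a.c": "a1\na2\na3\n", "b.c": "b1\nb2\nb3\n"}
merged_child = {"combined.c": "a1\na2\na3\nb1\nb2\nb3\n"}
ambiguous = rename_candidates(merged_parent, merged_child, "combined.c")
```

In the simple case the list has one entry, which is why following looks
"obviously correct". In the combined case it has two, and anything that
insists on a single file identity has to pick one and drop the other.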

> We also do have some notion of "file identity" in the kernel;

No, we really really don't.

The CVS/SVN kind of "file identity" is more like an "inode". Nothing
in the kernel sources cares about the inode number of a file. The
inode will be different depending on how something was created, and
when you rename what previously were two different files to one single
path (as a result of a merge), you have to pick one at random, and
lose the other.

So you end up with the crazy random "Attic" model of stale files in
CVS, exactly because the thing is based on a file identity that is
completely fundamentally broken.

Note how you've never seen anything like that in git. Because the
whole concept is garbage, and git isn't garbage.

Yes, I still hate CVS with a passion, almost two decades after I had
to use that horrid horrid thing. Some mental scars will not go away.

> it's prevalent with "quirk files". Look at these for example:
> [ deleted]
> We know that patches to those files are likely to contain quirks

No, those are not file identities AT ALL.

Those are just pathnames with some meaning. You can throw away the
file, and start a new one, and the meaning doesn't go away - because
it's attached to the path.

And yes, certain paths in the repository can be special, although
that's irrelevant to a SCM, of course. Git won't care. It's just
"contents with a name".

Which is exactly what git tracks, and is *not* what the SVN/CVS kind
of completely broken file identity is all about.
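
A minimal sketch of "contents with a name", assuming a snapshot can be
modelled as a plain mapping from path to content ID. The ID below uses
the same "blob <size>\0" header plus SHA-1 scheme that git uses for blob
objects. The snapshot() helper and the path are hypothetical and exist
only for this illustration.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Content ID computed the way git hashes a blob object."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def snapshot(files):
    """Hypothetical model of one tree state: path -> content ID, nothing else."""
    return {path: blob_id(data) for path, data in files.items()}


# Hypothetical path, standing in for any path that carries meaning.
path = "some/dir/quirks.c"

before = snapshot({path: b"old quirk table\n"})
# The file is thrown away and a new one is started at the same path.
after = snapshot({path: b"brand new quirk table\n"})

# Same content stored under two names yields the same ID under both.
copies = snapshot({"one.c": b"same\n", "two.c": b"same\n"})
```

Both snapshots have an entry for the same path, and nothing in either one
records whether the second file is "the same file" as the first. The
meaning stays with the path, the ID depends only on the content, and
there is no third thing that could act as an inode-style identity.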

Linus