Re: [alsa-devel] HG -> GIT migration

From: Linus Torvalds
Date: Wed May 21 2008 - 13:44:16 EST

Next message: Adrian Bunk: "[2.6 patch] scsi/advansys.c: fix compile errors"
Previous message: Greg Smith: "PostgreSQL pgbench performance regression in 2.6.23+"
In reply to: Takashi Iwai: "Re: [alsa-devel] HG -> GIT migration"
Next in thread: Linus Torvalds: "Re: [alsa-devel] HG -> GIT migration"

On Wed, 21 May 2008, Takashi Iwai wrote:
>
> Well, what I meant is about the fixes to the subsystem (say, ALSA) by
> people in the outside. Not every ALSA-bugfix patch goes into the
> upstream from ALSA tree. You, Andrew and others pick individually
> ALSA-fix patches. They will be missing in the ALSA subsystem tree.

Well, that's actually fairly rare, but when it happens, either:

- if you didn't get the fix (ie you're just seeing random patches go
in that happen to touch alsa), why should you then merge the WHOLE TREE
with all my experimental stuff anyway? You can largely ignore it,
knowing it's fixed, and when you ask me to pull, we'll have a good
end result.

- if you got the same fix as a patch, just apply it to your tree (ie just
ignore what happens upstream). This happens all the time - people
duplicate patches simply because two people apply it.

But the real issue here is that my tree sometimes gets ten THOUSAND
commits during the merge window. Do you really want to pull those
thousands of commits into your tree just for one or two possible ALSA
fixes?
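
One way to put numbers on that question before merging is to count what the merge would bring in, and how much of it touches your own directory. What follows is only a minimal sketch in a throwaway repository: the branch names `mainline` and `alsa`, the file names, and the commit counts are all made up for illustration.

```shell
#!/bin/sh
# Sketch: measure what a merge would pull in before doing it.
# Everything here lives in a temporary demo repository.
set -eu

demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/mainline   # hypothetical upstream branch name
git config user.name "Example Dev"
git config user.email "dev@example.invalid"
git config commit.gpgsign false

mkdir sound net
echo base > sound/core.c
echo base > net/core.c
git add .
git commit -q -m "base"

# The subsystem tree forks from this point.
git branch alsa

# Upstream moves on: five unrelated commits and one that touches sound/.
for i in 1 2 3 4 5; do
    echo "$i" >> net/core.c
    git commit -q -am "net: change $i"
done
echo fix >> sound/core.c
git commit -q -am "sound: fix"

git checkout -q alsa

# All commits a merge of mainline would add to the alsa branch.
incoming=$(git rev-list --count alsa..mainline)
# Only those among them that touch sound/.
relevant=$(git rev-list --count alsa..mainline -- sound/)

echo "merge would bring in $incoming commits, $relevant touching sound/"
```

In a real tree the same two `git rev-list --count` calls, run against the actual upstream ref, show the ratio of relevant to unrelated commits.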

In _my_ tree, at least the people involved with asking me to pull end up
also (a) having people test it and (b) being aware that it's in my tree, so
they work on trying to fix it. But if ALSA just merges at random times,
neither of those two things is true. Nobody will know about or test some
random state that ALSA merged into its own tree.

Ask yourself (and ignore the ALSA parts - think of some totally
*different* development area) which you think is better

- developing in one area based on a stable base, with the people who do
development in that area knowing about that area.

- or develop on top of a churning sea of thousands of changes to other
sub-areas that you don't know anything about?

In other words, the reason I ask people to not do lots of merges is more
than just "it looks confusing". It's literally a matter of "it's bad
development practice". It causes problems. The confusing history is
actually *real* - it's not just a "visual artifact" of looking at the
result in gitk. The confusing history is a real phenomenon, and implies
that people are doing development not based on some tested base.

> And, what if that you need a fix for the fix that isn't in ALSA
> tree...? IMO, either a rebase or a merge is better than
> cherry-picks.

First off, I don't see why you even need cherry-picks in the first place.
I think your argument is bogus, and you're making it because you want to
get the end result, not because the argument is valid on its own.

Here, let's see what I committed to the sound subsystem since 2.6.24
(ignoring merges):

	git log --no-merges v2.6.24.. --committer=torvalds sound/

and look over that list. Remember: this is not some short timeframe, this
is over TWO whole merge windows, ie this is way more commits than we would
normally _ever_ get out of sync over.
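
To illustrate what that kind of query selects, here is a small sketch in a throwaway repository. The committer names `toplevel` and `subsys`, the tag `v1.0`, and the file names are hypothetical stand-ins. The point is that `--committer` filters on who applied the patch, not on who wrote it, and that the trailing path limits the result to one directory.

```shell
#!/bin/sh
# Sketch: list non-merge commits applied by one committer under sound/
# since a release tag. All names are made up for the demo.
set -eu

demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/mainline
git config commit.gpgsign false

# The author is the same for every commit; only the committer varies.
export GIT_AUTHOR_NAME="Patch Author" GIT_AUTHOR_EMAIL="author@example.invalid"

mkdir sound fs

# commit_as COMMITTER PATH SUBJECT: append a line to PATH and commit it
# with the given committer identity.
commit_as() {
    echo "$3" >> "$2"
    git add "$2"
    GIT_COMMITTER_NAME="$1" GIT_COMMITTER_EMAIL="$1@example.invalid" \
        git commit -q -m "$3"
}

commit_as subsys sound/pcm.c "base"
git tag v1.0    # hypothetical release tag

commit_as toplevel sound/pcm.c "sound: quick fix applied directly"
commit_as subsys   sound/pcm.c "sound: feature via subsystem tree"
commit_as toplevel fs/inode.c  "fs: unrelated fix"

# Same shape as the command above: since the tag, no merges,
# one committer, one directory.
direct=$(git log --no-merges --format=%s --committer=toplevel v1.0.. -- sound/)
echo "$direct"
```

Only the one sound/ commit applied directly by `toplevel` is listed; the subsystem's own commit and the unrelated fs/ fix are filtered out.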

Realistically, which of those commits aren't either (a) already from you,
sent to me just as a way to get a quick fix into my tree without merging
the whole thing, or (b) stuff that can just be in my tree and doesn't
have to be in the ALSA tree until the next release?

Honestly, now: does *any* of those commits look like "we should merge all
the other changes just because we need that commit _now_ in ALSA"?

I really doubt it.

So I'd seriously suggest submaintainers merge *AT*MOST* once a week, and
preferably much much less often than that. There simply isn't any real
reason to do it more often. Because it can cause problems.

That's why my suggested rule is:

- merge with mainline at major releases

This is "safe". Yes, releases still have bugs, but on the other hand,
they have far fewer problems than random git trees of the day, so they
are a lot safer targets to merge.

- merge with mainline if you know there are real conflicts that need to
be resolved.

This isn't "safe", but it's about trying to resolve conflicts early, so
at some point the downside of merging with a "random point" is smaller
than the downside of delaying the merge!

but perhaps the most important rule is that things should never be
*really* black-and-white, and in the end the really fundamental rule
should be:

- Use your own judicious good sense, and merge at other points as
necessary, but just keep in mind that a merge is a big change.

Yes, merging with git may be technically really really trivial and take
all of two seconds of your time, but:

(a) you *do* potentially get thousands of new commits that aren't
actually related to your work and that you probably don't know
well.
(b) others, when they look at your history, will have a harder time
following it.

so while I can give you a few guidelines, in the end those guidelines are
just _examples_ of when merges can make sense. You need to understand what
the impact of a merge is - and that while git makes merging technically
pretty damn trivial most of the time, a merge should still be a big deal,
and something you think about.
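
The "merge at major releases" rule above can be sketched as merging a release tag instead of whatever the upstream tip happens to be that day. This is a minimal illustration in a throwaway repository; the tag `v1.0`, the branch names `mainline` and `topic`, and the file names are assumptions made up for the demo.

```shell
#!/bin/sh
# Sketch: merge a tagged release into a topic branch, leaving the
# untested post-release churn out of the topic's history.
set -eu

demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/mainline
git config user.name "Example Dev"
git config user.email "dev@example.invalid"
git config commit.gpgsign false

echo base > file.c
git add file.c
git commit -q -m "base"
git branch topic

echo rel >> file.c
git commit -q -am "release work"
git tag v1.0                        # hypothetical release tag

echo churn >> file.c
git commit -q -am "post-release churn"

git checkout -q topic
echo dev > driver.c
git add driver.c
git commit -q -m "topic: driver work"

# Merge the release point, not the moving tip of mainline.
git merge -q --no-edit v1.0

# The release is now part of the topic history; the later churn is not.
if git merge-base --is-ancestor v1.0 HEAD; then has_release=yes; else has_release=no; fi
if git merge-base --is-ancestor mainline HEAD; then has_churn=yes; else has_churn=no; fi
echo "release merged: $has_release, post-release churn merged: $has_churn"
```

The design choice is simply that a tag names a state other people have tested, while the branch tip names whatever landed last.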

So the kinds of merges I *really* dislike are the ones that are basically
"let's do a regular merge every day to keep up-to-date". That's fine if
you don't do any development at all and "git pull" is just basically a
"track the current development kernel for testing", but if it involves a
merge, it means that there is something wrong in your development model.

> But, my question is about the divergence between the development and
> for-linus branches: how to apply patches that exist only in for-linus
> tree back.

How often does it happen? And how big/important are those? I really think
it's probably a "maybe once or twice a release cycle".

And then, the actual answer can be different depending on the details. For
example, there are really three things you can do:

- ignore it. Is it a cleanup patch (like the sparse patches) or just
fairly trivial stuff that doesn't matter in real life ("remove
duplicated unlikely()" patch or the /proc fixups)?

This is often the right thing to do. You _will_ merge eventually
anyway, we know that. I'd expect merges to happen at least once in the
development cycle, maybe twice.

Yes, the patch may touch the sound system, but do you really _care_
about it happening right now, or can you just wait until the next merge
you do?

- cherry-pick it. Is it a small, simple patch that you want, but that
isn't really worth pulling in all the other stuff that you simply don't
know?

This isn't wrong. It shouldn't be *common*, but it's not wrong to have
the same patch in two different branches. It makes sense if it is
something you really want, but it's still not important or complex
enough to actually merge everything else!

- and finally: merge. It really can be the RightThing(tm). Is it a
biggish infrastructure change? Is it a series of several related and
dependent commits?

In other words: is it something big enough that you'd rather merge
everything else too (which at least has gotten tested together)? If so,
merging is absolutely the right thing to do!
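
The cherry-pick option from the list above can be sketched as follows. This is a minimal illustration in a throwaway repository, with made-up branch and file names. It picks one small fix from `mainline` onto `topic` without pulling in the unrelated commits around it, then uses `git cherry` to show that git recognises the same patch on both sides.

```shell
#!/bin/sh
# Sketch: take one fix from upstream without merging everything else.
set -eu

demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/mainline
git config user.name "Example Dev"
git config user.email "dev@example.invalid"
git config commit.gpgsign false

echo base > sound.c
echo base > other.c
git add .
git commit -q -m "base"
git branch topic

echo churn >> other.c
git commit -q -am "other: unrelated churn"

echo fix >> sound.c
git commit -q -am "sound: small fix"
fix=$(git rev-parse HEAD)           # the one commit we actually want

echo more >> other.c
git commit -q -am "other: more unrelated churn"

git checkout -q topic

# -x records the original commit id in the new commit message.
git cherry-pick -x "$fix" >/dev/null

# "git cherry" marks commits on topic whose patch already exists
# upstream with a leading "-".
status=$(git cherry mainline topic | cut -c1)
echo "fix picked; patch also present upstream: $status"
```

After this, `sound.c` on `topic` has the fix while `other.c` is untouched, and the duplicate patch causes no trouble at the next real merge because both sides made the identical change.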

So merging on its own is not "wrong or evil" at all. Merging is a very
good operation to do, but *mindless* merging is bad. That's all
that I'm really trying to argue against.

If you thought it through, and decided that yes, you really want to merge,
then you should merge. I just think a lot of people merge without even
thinking about all the other things it involves, just because git made it
*so* easy to do.

Linus
