Re: Mercurial 0.4b vs git patchbomb benchmark

From: Matt Mackall
Date: Fri Apr 29 2005 - 15:46:31 EST


On Fri, Apr 29, 2005 at 10:30:27PM +0200, Andrea Arcangeli wrote:
> On Thu, Apr 28, 2005 at 11:01:57PM -0700, Matt Mackall wrote:
> > change nodes so you've got to potentially traverse all the commits to
> > reconstruct a file's history. That's gonna be O(top-level changes)
> > seeks. This introduces a number of problems:
> >
> > - no way to easily find previous revisions of a file
> > (being able to see when a particular change was introduced is a
> > pretty critical feature)
> > - no way to do bandwidth-efficient delta transfer
> > - no way to do efficient delta storage
> > - no way to do merges based on the file's history[1]
>
> And IMHO also no-way to implement a git-on-the-fly efficient network
> protocol if tons of clients connects at the same time, it would be
> dosable etc... At the very least such a system would require an huge
> amount of ram. So I see the only efficient way to design a network
> protocol for git not to use git, but to import the data into mercurial
> and to implement the network protocol on top of mercurial.
>
> The one downside is that git is sort of rock solid in the way it stores
> data on disk, it makes rsync usage trivial too, the git fsck is reliable
> and you can just sign the hash of the root of the tree and you sign
> everything including file contents. And of course the checkin is
> absolutely trivial and fast too.

Mercurial is ammenable to rsync provided you devote a read-only
repository to it on the client side. In other words, you rsync from
kernel.org/mercurial/linus to local/linus and then you merge from
local/linus to your own branch. Mercurial's hashing hierarchy is
similar to git's (and Monotone's), so you can sign a single hash of
the tree as well.

> With a more efficient diff-based storage like mercurial we'd be losing
> those fsck properties etc.. but those reliability properties don't worth
> the network and disk space they take IMHO, and the checkin time
> shouldn't be substantially different (still running in O(1) when
> appending at the head). And we could always store the hash of the
> changeset, to give it some basic self-checking.

I think I can implement a decent repository check similar to git, it's
just not been a priority.

--
Mathematics is the supreme nostalgia of our time.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/