Re: Mercurial vs Updated git HOWTO for kernel hackers

From: Matt Mackall
Date: Tue Jun 28 2005 - 18:02:06 EST


On Tue, Jun 28, 2005 at 06:23:09PM -0400, Sean wrote:
> On Tue, June 28, 2005 6:14 pm, Matt Mackall said:
> >> You can even have a setup where objects
> >> are archived onto write-once media like DVD and still participate in a
> >> live repository, where new objects are written to hard disk, but older
> >> object are (automatically) sourced from the DVD.
> >
> > Have fun with that. It's an excellent way to destroy your DVD drive.
>
> Oh come on, stop the FUD. You pack all the objects up into a single pack
> file (see new feature in Git) and you burn it _once_ to dvd or cdrom.

And even as one big file, it will _still_ be layed out on disk in
pessimal order.

> > Git's completely structureless filename hashing pretty much guarantees
> > that disk layout will degrade to worst-case random access behavior
> > over time. Just walking through the 2000 commit blobs in the current
> > tree can take minutes cold cache on a fast hard disk. Walking the 1700
> > tree blobs in a given version takes quite a while too.
> >
> > Put that on a DVD and that could easily turn into hours of continuous
> > seeking for a simple operation like checking out tip of tree.
> >
> > And as far as I know, ISO9660 and UDF don't really handle huge
> > directories well. So if you try and put the whole kernel history (200k
> > files, some huge number of directory blobs, and 30k-60k commit blobs)
> > on a DVD, you'll be really hurting.
> >
> > Meanwhile the whole history (>30k changesets) with Mercurial fits on a
> > regular CD, with reasonable directory sizes and I/O patterns.
> >
> > Not that it's really worth the trouble. It takes longer for me to burn
> > an ISO image to disc than to download a complete kernel repo from
> > kernel.org.
> >
>
> Git is still developing, there will be new ways to seek and cache things
> etc eventually that remove any performance issue.

Again, have fun with that. Mercurial already went down this path a
month ago, discovered it couldn't reasonably be fixed without
abandoning the hashes as file name scheme, and changed repo layout.

Git's going to have a much harder time as it's pretty solidly tied to
lookup by contents hash. If you throw that out, you might as well use
Mercurial.

--
Mathematics is the supreme nostalgia of our time.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/