Re: [rfc] git: combo-blobs

From: Paul Jackson
Date: Mon Apr 11 2005 - 09:48:02 EST


Hmmm ... I have this strong sense that I am about 2 hours away from
smacking my forehead and groaning "Duh - so that's what Ingo meant!"

However, one must play out one's destiny.

Could you provide an example scenario, which results in the creation of
a combo-blob?

The best I can come up with is the following.

Let's say Nick changes one line in the middle of kernel/sched.c
(yeah - I know - unlikely scenario - he usually changes more
than that - never mind that detail.)

In the days Before Combo Blobs (BCB), git would have been told that
kernel/sched.c was to be picked up, and would have wrapped it up in a
zlib'd blob, sha1summed it, seen it was a new sum, and added that blob
to its objects (or something like this -- I'm still a little fuzzy on
these git details.)

But Nick just downloaded the latest git 1.5.11.1 which has added support
for combo blobs, so now, guessing here, instead of wrapping up the new
sched.c, git instead unwraps the old one, diffs it against the new, notices a
couple of long sequences that are unchanged, wraps up both of those
sequences as a couple of relatively large blobs, and wraps up the new
lines that Nick just coded in the middle as a small blob, and puts all
three in the object store, along with another small combo-blob, tying
them all together.
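
To make that guess concrete, here is a rough Python sketch of what I
imagine the combo-blob write path doing. Everything in it is made up for
illustration (store_blob, store_combo, read_combo, the "combo" header
line, and a plain dict standing in for the object store); it is not how
git actually stores anything.

```python
import difflib
import hashlib
import zlib


def store_blob(store, data):
    """Zlib-compress data and file it under the SHA-1 of its contents."""
    key = hashlib.sha1(data).hexdigest()
    store.setdefault(key, zlib.compress(data))  # identical content is stored once
    return key


def store_combo(store, old, new):
    """Hypothetical combo-blob writer: one blob per diff chunk, plus an index blob."""
    old_lines = old.splitlines(keepends=True)
    new_lines = new.splitlines(keepends=True)
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    parts = []
    for _tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        chunk = b"".join(new_lines[j1:j2])
        if chunk:  # a pure deletion contributes nothing to the new file
            parts.append(store_blob(store, chunk))
    # The combo blob is just the ordered list of part names (assumed format).
    index = b"combo\n" + "".join(p + "\n" for p in parts).encode()
    return store_blob(store, index), parts


def read_combo(store, key):
    """Reassemble a file from a combo blob written by store_combo."""
    body = zlib.decompress(store[key])
    assert body.startswith(b"combo\n")
    names = [n.decode() for n in body.split(b"\n")[1:] if n]
    return b"".join(zlib.decompress(store[n]) for n in names)
```

Feed it a ten-line file with one line changed in the middle and the
store ends up with four objects: the two unchanged runs, the changed
line, and the combo blob tying them together.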

So far, not too bad. Haven't gained anything, and required the
unpacking of a zlib blob we didn't require before, and the running and
analyzing of a diff we didn't require before, but the end result is only
moderately worse - four object blobs instead of one, but of total size
not much larger (well, total size typically 3 disk blocks worse, due to
a slight increase in fragmentation from using 4 blocks to store what
used to be in one.)
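
The block arithmetic behind that guess, as a back-of-envelope sketch.
The 4096-byte block size and the object sizes below are assumptions
picked for illustration, not measurements:

```python
def blocks_used(sizes, block_size=4096):
    """Disk blocks consumed when each object is stored in its own file."""
    # Each file rounds up to a whole number of blocks (ceiling division).
    return sum(-(-size // block_size) for size in sizes)


one_blob = [3000]                   # hypothetical: whole file as one compressed object
four_blobs = [1400, 1500, 60, 130]  # hypothetical: two big runs, new lines, combo blob
extra = blocks_used(four_blobs) - blocks_used(one_blob)
```

With these numbers extra comes out to 3, since each of the four small
objects still occupies a full block of its own.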

But now I get stuck. Unless I throw in something like the interleaved
delta compression that's at the heart of Marc Rochkind's old SCCS code
(and Larry's rewrite thereof), I don't see how we ever come to the
practical realization that any of these four new blobs can ever be
reused.

So explain to me again how we ever gain anything with these combo blobs,
while I take a prophylactic aspirin, so the forehead whack won't hurt as
much.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@xxxxxxxxxxxx> 1.650.933.1373, 1.925.600.0401