Re: [ OFFTOPIC ] Re: The Linux Kernel Project Management System (INITIAL PROPOSAL)

Larry McVoy (lm@bitmover.com)
Mon, 27 Sep 1999 23:37:58 -0700


On Tue, Sep 28, 1999 at 01:50:25AM -0400, Nat Lanza wrote:
> You don't listen, do you? No matter how much I say "application-level
> checking is good but you still need other measures", all you hear is
> "application-level checking is bad".

Sure I do and I heard you and Peter the first time. And I've been
in this argument a hundred times in a hundred different disciplines.
If you had read Clark's paper (don't they make people read that in OS
101 anymore?), you'd realize that what you are saying is like this:

if I'm going from point A to point Z through B, C, D, E ...
then if each link is safe, then the whole way is safe. I.e.,
if A<->B doesn't screw it up, and B<->C doesn't screw it up,
etc., then it follows that A<->Z doesn't screw it up.

And I am saying that's wrong (which is point of Dave's paper).

You have to do end to end checking, you can't do intermediate checking,
that can't be guarenteed to be safe. This is networking 101 and it is
also systems architecture 101 - any hardware class should teach you this.

You keep telling me that backups and mirroring etc are needed and what
you are missing is that the things doing the backups and mirroring
have no way of knowing if that which they are storing is good or bad.

> > In 10 or more messages you've failed to get the point. Doing backups of
> > bad data is not very useful. How the hell do you know that you need to
> > restore from tape if the system never tells you that anything is wrong?
> > RCS / CVS store the latest version in clear text - every earlier version
> > could be completely corrupted and you'll cluelessly (through no fault of
> > your own) be backing up, mirroring, raiding, and CDROM burning COMPLETE
> > AND UTTER GARBAGE. Which you won't find out until you need to get an
> > _OLD_ version to fix a bug.
>
> Yeah, it's possible. But now we've moved from "totally screwed" to
> "way inconvenienced because I have to play hunt-for-the-tape for a
> long time". No, it's not optimal. I'm not claiming it is. I'm simply
> claiming that it's possible to avoid being screwed without having your
> version control system do lots of fancy error checking.

How? The only way I know of is to save infinite backups to the
beginning of time and go through them all one at a time. Let's see,
you were complaining about the cost of the product. Let's suppose you
have to do this once in the lifetime of your project and it is under
extreme pressure because the NSA needs a bug fix right now to a 5 year
old version of your code and you can't reproduce that 5 year old version
because your repository was corrupted. Now you have a couple hundred
backup tapes to look through to find this.

At this point, should this have happened, exactly how much would you
pay to get rid of the problem? And how hard would you be cursing the
guy that thought backups were a reasonable answer?

> Question: Does BitKeeper prevent me from having garbage in my
> repository? Completely? Or does it just make it less likely? I believe
> it just makes it less likely, which is why I claim that backups and
> disk arrays are still more important.

BK gives you back what you gave it. Period. BK checksums the file
every time you do any operation on it. You do a checkin, checkout, diff,
whatever, it checksums the entire file. It's a pretty shitty checksum,
but none the less, it checksums it. It also checksums each version of
each file; these checksums are passed in patches and checked when the
revision history is sent.

In addition, BK is inherently distributed with each repository having
copies of the revision history. So when a file goes bad, the likelihood
of you being able to get a good copy from some other repository is
extremely high.

BK can't prevent garbage in. I don't claim it does. What I claim is
that BK tries hard to make sure it can reproduce your world as of any
point in time. If you checked in garbage, you get back THE SAME GARBAGE,
not different garbage.

> You claimed that Aegis was worse than BK because of support.

I did no such thing. I pointed out that Aegis has some problems and
there is no plausible long term remedy to those problems. The point
was that the free software model, for these sorts of systems, has had
years and years to prove that it works and it has failed miserably.
This stuff just isn't sexy enough for people to work on it for free and
really solve all the problems. You can argue all you want but they are
hollow arguments - Aegis has been here for 7 or 8 years, CVS for longer,
and yet people feel the problem still isn't solved.

That's not a support issue, that's a business model issue.

> don't care about what anyone who disagrees with you has to
> say. They're all either fools or amateurs or deluded or in some way or
> another idiots who should not be listened to. Only you have the
> answer, and it is BitKeeper. You made it so the Linux kernel could use
> it and love it, and by God we'd better all be grateful. You deserve
> more than the insolence of dissenting opinions, you deserve
> applause.

Nat, I've shipped a lot of commercial code. Mostly in I/O and
file systems. I've spent an enormous amount of effort trying to make
sure that the system doesn't screw up the customers data. I take data
integrity seriously, perhaps because I've worked at places like SGI where
performance mattered more than integrity (the same line of reasoning
which gets RCS into trouble). I've done tons of customer visits where
I got to listen to frustrated customers scream at me for bad decisions
that other people made, decisions which ended up in corrupt data.

All that left me with a healthy respect for user data. And, yes, I think
people that don't take that seriously are very deluded and have absolutely
no business in the source management area (and I'd like to get them out
of operating systems in general, perhaps NFS would actually work then).
Sometimes I think that I'm not yet worried enough about integrity and
it makes my skin crawl to hear other people blithely dismiss it with
statements like "that's what backups are for".

I'm sorry if you feel unheard, or beat up, or whatever. I'm not attacking
you personally, I don't even know you. I'm attacking what sad experience
has taught me are misguided principles. There's a difference. You are
very likely an extremely sharp guy, most opinionated people are. But you
are also saying some things which I know to be incorrect. I'm attacking
those statements, not you.

-- 
---
Larry McVoy            	   lm@bitmover.com           http://www.bitmover.com/lm 

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/