Re: [PATCH] cowlinks v2

From: Tim Connors
Date: Fri Apr 02 2004 - 06:55:03 EST


ebiederm@xxxxxxxxxxxx (Eric W. Biederman) said on 25 Mar 2004 15:16:48 -0700:
> Linus Torvalds <torvalds@xxxxxxxx> writes:
>
> > On Thu, 25 Mar 2004, Jamie Lokier wrote:
> > >
> > > That is not useful for me or the other people who want to use this to
> > > duplicate large source trees and run "diff" between trees.
> > >
> > > "diff" depends on being able to check if files in the two trees are
> > > identical -- by checking whether the inode number and device (and
> > > maybe other stat data) are identical. This allows "diff -ur" between
> > > two cloned trees the size of linux to be quite fast. Without that
> > > optimisation, it's very slow indeed.
> >
> > I think the correct thing to do is to just admit that cowlinks aren't
> > POSIX, and instead see the inode number as a way to see whether the link
> > has been broken or not. Ie just accept the inode number potentially
> > changing.
> >
> > That would make "diff" (adn most other uses) ok with this, and anythign
> > that isn't, just couldn't be used with cowlinked files.
>
> Really?
>
> tar and cp still need to be taught about them. And if they are not taught
> they will do the wrong thing and hard link the files removing the
> copy on write semantics. Which would do ugly thing when you restore
> from your backup.
>
> I don't have a problem with the inode number changing when you write to
> a file, because I can't think of much that would care either way.
> But having the inode number of an open file change sounds like a very
> difficult problem to track.

If the inode doesn't change upon a write, the system is still POSIX,
so nothing breaks (the problem is, cowlinks would be very useful, and
POSIX is very useful - making them incompatible would be a major
bummer)

If you use sendfile() or cowcp() or similar to implement cowlinks,
then only rsync/tar/cp need be taught about its usage. Someone
mentioned that diff will not be optimal with different inodes, because
diff compared inodes. So add some diffstat() that keeps track of some
magical inode like thingy that stays the same for cowed files. Then
teach diff to use diffstat().

Now, you have a system where if your program doesn't explicitly use
cowcp() and diffstat(), everything still works perfectly (we want to
still be able to use old coreutils on a machine that has a new kernel
applied, right? And vice-versa?). All that happens is that your
program may or may not be sub-optimal - you might have to compare
entire files instead of just inodes. Or you may have to copy entire
files instead of just cowing them. But it all still Just Works.

My AU$0.02.

--
TimC -- http://astronomy.swin.edu.au/staff/tconnors/
I bet the human brain is a kludge.
-- Marvin Minsky
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/