Re: [PATCH] cowlinks v2

From: Jamie Lokier
Date: Fri Apr 02 2004 - 15:10:51 EST


Pavel Machek wrote:
> Okay, now I have to start talking about implementation. Assume ext2 as
> a base. Theres new object "cowid" which contains, well, id for
> get_data_id() and usage count. Each inode either has pointer to
> "cowid" object, or it is plain old regular file.

Pavel has it exactly right.

A simple way to store COWID objects in the filesystem itself is as
another ordinary inode. The attributes of that inode (mtime, mode
etc.) aren't important (except to fsck), only the size and data
pointers are important. The files which point to a COWID need a flag
to indicate that, too.

Someone said that a problem with using sendfile() to create cowlinks
is that sendfile() takes a length parameter. It's a size_t, which
isn't large enough to copy Large files.

Actually that isn't a problem. The COWID inodes contain the real
length of the shared data. It would be possible for the individual
cowlink inodes to have a smaller length, indicating that they don't
have all the data. Then a cp implementation which calls sendfile()
repeatedly could copy large files and still share the data. Provided
the offset and length are compatible with sharing, the first
sendfile() would create the COWID (if it doesn't exist already), and
subsequent ones would simply enlarge the individual cowlink's length
attribute. It's not the cleanest interface; I only mention it because
sendfile() exists already and could be used.

get_data_id() is one way to detect equivalent files. Another would be
a function files_equal(fd1, fd2) which returns a boolean.
get_data_id() has the advantage that it can report immediately whether
a file has _any_ cowlink peers, which is important for programs that
scan trees. Perhaps getxattr() would be reasonable interface, using a
named attribute "data-id".

-- Jamie
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/