Re: [PATCH] cowlinks v2
From: Eric W. Biederman
Date: Sat Apr 03 2004 - 14:50:06 EST
Jörn Engel <joern@xxxxxxxxxxxxxxxxxxxx> writes:
> On Tue, 30 March 2004 01:16:35 +0200, Pavel Machek wrote:
> >
> > I think they *should* have separate permissions.
>
> That makes the count 2:2. I'll continue to follow the simple solution
> for some time, but wouldn't like to have it included for now (or ever?)
>
> > Also it should be possible to have file with 2 hardlinks cowlinked
> > somewhere, and possibly make more hardlinks of that one... Having
> > pointer to another inode in place where direct block pointers normally
> > are should be enough (thinking ext2 here).
>
> All right, you are proposing hell. I've tried to think through all
> possibilities and was too scared to continue. So limitation is that
> cowlinks and hardlinks are mutually exclusive, which eliminated all
> problems.
Sounds like a simple place to start.
> If you really want cowlinks and hardlinks to be intermixed freely, I'd
> happily agree with you as soon as you can define the behaviour for all
> possible cases in a simple document and none of them make me scared
> again. Show me that it is possible and makes sense.
Concise summary:
- (indirection inodes) solve the hard link problem.
- Don't share the page cache between cow copies.
For intermixing hard links and cow copy operations a single extra layer
of indirection solves the problem. The cow files point to an (indirection)
inode that holds the hard link count, the real inode number and possibly
a few extra things like permissions. Except for never being pointed
at by a directory entry the real inode works like normal. The link
count of the real inode is the number of indirection inodes pointing
at it.
Caching issues are more subtle. The pathological case is a read only
mapping of the cow file, that expects to see the mapping changed by
some other process writing to the file. Not sharing page cache
entries between cow copies solves this problem.
Once page cache entries are distinct it is possible to move the
trigger down from an open with intent to write, to the first
actual write operation. In aops->prepare_write from sys_write,
and aops->writepage from the mmap version.
And a few scenarios to hopefully make things clear.
So your fs starts out as:
/file1 -> ino1 (link count 2) -> data block #1
/file2 -> ino1 (link count 2) -> data block #1
file1 is only 4K long, so I only need to describe one data block.
Actions:
cowcopy(file2, file3):
/file1 -> ino1 (link count 2) -> ino2 (link count 2) -> data block #1
/file2 -> ino1 (link count 2) -> ino2 (link count 2) -> data block #1
/file3 -> ino3 (link count 1) -> ino2 (link count 2) -> data block #1
copyfile(file3, file4):
/file1 -> ino1 (link count 2) -> ino2 (link count 3) -> data block #1
/file2 -> ino1 (link count 2) -> ino2 (link count 3) -> data block #1
/file3 -> ino3 (link count 1) -> ino2 (link count 3) -> data block #1
/file4 -> ino4 (link count 1) -> ino2 (link count 3) -> data block #1
unlink(file2):
/file1 -> ino1 (link count 1) -> ino2 (link count 3) -> data block #1
/file3 -> ino3 (link count 1) -> ino2 (link count 3) -> data block #1
/file4 -> ino4 (link count 1) -> ino2 (link count 3) -> data block #1
link(file4, file5):
/file1 -> ino1 (link count 1) -> ino2 (link count 3) -> data block #1
/file3 -> ino3 (link count 1) -> ino2 (link count 3) -> data block #1
/file4 -> ino4 (link count 2) -> ino2 (link count 3) -> data block #1
/file5 -> ino4 (link count 2) -> ino2 (link count 3) -> data block #1
write(file3):
/file1 -> ino1 (link count 1) -> ino2 (link count 2) -> data block #1
/file3 -> ino3 (link count 1) -> data block #2
/file4 -> ino4 (link count 2) -> ino2 (link count 2) -> data block #1
/file5 -> ino4 (link count 2) -> ino2 (link count 2) -> data block #1
write(file5):
/file1 -> ino1 (link count 1) -> ino2 (link count 1) -> data block #1
/file3 -> ino3 (link count 1) -> data block #2
/file4 -> ino4 (link count 2) -> data block #3
/file5 -> ino4 (link count 2) -> data block #3
write(file1):
/file1 -> ino1 (link count 1) -> data block #1 (with modified contents)
/file3 -> ino3 (link count 1) -> data block #2
/file4 -> ino4 (link count 2) -> data block #3
/file5 -> ino4 (link count 2) -> data block #3
Does this make things clear?
Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/