Re: Finding hardlinks

From: Jeff Layton
Date: Thu Dec 28 2006 - 10:55:46 EST


Benny Halevy wrote:
Jeff Layton wrote:
Benny Halevy wrote:
It seems like the posix idea of unique <st_dev, st_ino> doesn't
hold water for modern file systems and that creates real problems for
backup apps which rely on that to detect hard links.
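For readers following along, here is a minimal sketch of the detection technique under discussion: walk a tree and group paths that share a <st_dev, st_ino> pair. The function names (find_hardlink_groups, demo_hardlink_groups) are made up for illustration, and the approach is only as reliable as the filesystem's guarantee that the pair is unique, which is exactly what this thread questions.

```python
import os
import tempfile
from collections import defaultdict


def find_hardlink_groups(root):
    """Group paths under root that share the same (st_dev, st_ino) pair.

    Assumption: the filesystem reports a unique (st_dev, st_ino) per file.
    Where that does not hold, distinct files may be grouped by mistake.
    """
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: inspect the entry itself, not a symlink target
            if st.st_nlink > 1:  # a link count of 1 cannot have another name
                groups[(st.st_dev, st.st_ino)].append(path)
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]


def demo_hardlink_groups():
    """Build a throwaway tree with one hard-linked pair and one unrelated file."""
    with tempfile.TemporaryDirectory() as root:
        a = os.path.join(root, "a.txt")
        b = os.path.join(root, "b.txt")
        c = os.path.join(root, "c.txt")
        with open(a, "w") as f:
            f.write("data")
        os.link(a, b)  # b.txt is a second name for the same inode as a.txt
        with open(c, "w") as f:
            f.write("data")  # same contents, but a separate inode
        groups = find_hardlink_groups(root)
        return [[os.path.basename(p) for p in g] for g in groups]
```

A backup tool would typically store the first path of each group as a regular file and record the remaining paths as links to it.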

Why not? Granted, many of the filesystems in the Linux kernel don't enforce that they have unique st_ino values, but I'm working on a set of patches to try and fix that.

That's great and will surely help most file systems (apparently not Coda as
Jan says they use 128 bit internal file identifiers).

What about 32 bit architectures? Is ino_t going to be 64 bit
there too?


Sorry, I should qualify that statement. A lot of filesystems don't have permanent i_ino values (mostly pseudo filesystems -- pipefs, sockfs, /proc stuff, etc). For those, the idea is to try to make sure we use 32 bit values for them and to ensure that they are uniquely assigned. I unfortunately can't do much about filesystems that do have permanent inode numbers.

Adding a vfs call to check for file equivalence seems like a good idea to me.
A syscall exposing it to user mode apps can look like what you sketched above,
and another variant of it could take two paths and possibly a flags field
(e.g. don't follow symlinks).
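As a rough user-space illustration of the two-path variant with a flags field, the sketch below compares two paths with an optional "don't follow symlinks" switch. The name same_file and its follow_symlinks parameter are hypothetical and are not the proposed syscall. Because it falls back on comparing <st_dev, st_ino>, it inherits the very limitation a real vfs call would be meant to remove.

```python
import os
import tempfile


def same_file(path_a, path_b, follow_symlinks=True):
    """Approximate a file-equivalence check by comparing (st_dev, st_ino).

    Hypothetical helper: a real vfs-level call could ask the filesystem
    directly instead of trusting inode numbers to be unique.
    """
    st_a = os.stat(path_a, follow_symlinks=follow_symlinks)
    st_b = os.stat(path_b, follow_symlinks=follow_symlinks)
    return (st_a.st_dev, st_a.st_ino) == (st_b.st_dev, st_b.st_ino)


def demo_same_file():
    """Compare a file against a hard link and a symlink to it."""
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "target")
        hard = os.path.join(root, "hard")
        sym = os.path.join(root, "sym")
        with open(target, "w") as f:
            f.write("x")
        os.link(target, hard)    # second name for the same inode
        os.symlink(target, sym)  # separate inode that points at target
        return (
            same_file(target, hard),
            same_file(target, sym),
            same_file(target, sym, follow_symlinks=False),
        )
```

With the flag cleared, the symlink is compared as an object in its own right, so it no longer matches its target.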

I'm cross-posting this also to nfsv4@ietf. NFS has exactly the same problem
with <fsid, fileid>, as fileid is 64 bits wide. The nfs client can
determine that two filesystem objects are hard linked if they have the same
filehandle, but there are cases where two distinct filehandles can still refer to
the same filesystem object. Letting the nfs client determine file equivalency
based on filehandles will probably satisfy most users but if the exported
fs supports the new call discussed above, exporting it over NFS makes a
lot of sense to me... What do you guys think about adding such an operation
to NFS?

This sounds like a bug to me. It seems like we should have a one to one correspondence of filehandle -> inode. In what situations would this not be the case?

Well, the NFS protocol allows that [see rfc1813, p. 21: "If two file handles from
the same server are equal, they must refer to the same file, but if they are not
equal, no conclusions can be drawn."]

As an example, some file systems encode hint information into the filehandle,
and the hints may change over time. Another example is encoding parent
information into the filehandle; handles representing hard links
to the same file from different directories will then differ.
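To make the second example concrete, here is a toy model of a filehandle that embeds the parent directory's id. The three-field layout and the names make_handle and compare_handles are invented for illustration only; no real NFS server is claimed to use this encoding. It shows the rfc1813 rule in action: equal handles prove identity, unequal handles prove nothing.

```python
import struct


def make_handle(fsid, fileid, parent_fileid):
    """Pack a hypothetical opaque handle: fsid, fileid, and parent directory id."""
    return struct.pack(">QQQ", fsid, fileid, parent_fileid)


def compare_handles(h1, h2):
    """Apply the rfc1813 rule to two opaque handles from the same server.

    Returns True when the handles are byte-for-byte equal (same file),
    and None when they differ (no conclusion can be drawn).
    """
    return True if h1 == h2 else None


# One file (fileid 42) hard linked from two directories (ids 7 and 9).
via_dir_7 = make_handle(1, 42, 7)
via_dir_9 = make_handle(1, 42, 9)
```

Here compare_handles(via_dir_7, via_dir_9) yields None even though both handles name the same file, which is why a client cannot rely on handle comparison alone to detect hard links.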


Interesting. That does seem to break the method of st_dev/st_ino for finding hardlinks. For Linux fileservers I think we generally do have 1:1 correspondence so that's not generally an issue.

If we're getting into changing specs, though, I think it would be better to change it to enforce a 1:1 filehandle to inode correspondence rather than making new NFS ops. That does mean you can't use the filehandle for carrying other info, but it seems like there ought to be better mechanisms for that.

-- Jeff