Re: Finding hardlinks

From: Mikulas Patocka
Date: Tue Jan 02 2007 - 16:38:11 EST


Certainly, but tar isn't going to remember all the inode numbers.
Even if you solve the storage requirements (not impossible) it would
have to do (4e9^2)/2=8e18 comparisons, which computers don't have
enough CPU power just yet.

It is remembering all inode numbers with nlink > 1 and many other tools
are remembering all directory inode numbers (see my other post on this
topic).

Don't you mean they are remembering all the inode numbers of the
directories _above_ the one they are currently working on? I'm quite
sure they aren't remembering all the directories they have processed.

cp -a is remembering all directory inodes it has visited, not just path from root. If you have two directories with the same inode number anywhere in the tree, it will skip one of them.

Mikulas

It of course doesn't compare each number with all others, it is
using hashing.

Yes, I didn't think of that.

It doesn't matter if there are collisions within the filesystem, as
long as there are no collisions between the set of files an
application is working on at the same time.

--- that are all files in case of backup.

No, it's usually working with a _single_ file at a time. It will
remember inode numbers of files with nlink > 1, but it won't remember
all the other inode numbers.

You could have a filesystem with 4billion files, each one having two
links. Not a likely scenario though.

Miklos

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/