Certainly, but tar isn't going to remember all the inode numbers.
Even if you solve the storage requirements (not impossible), it would
have to do (4e9)^2/2 = 8e18 comparisons, and computers don't have
enough CPU power for that just yet.
It is remembering all inode numbers with nlink > 1 and many other tools
are remembering all directory inode numbers (see my other post on this
topic).
Don't you mean they are remembering all the inode numbers of the
directories _above_ the one they are currently working on? I'm quite
sure they aren't remembering all the directories they have processed.
Of course it doesn't compare each number with all the others; it
uses hashing.
Yes, I didn't think of that.
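
To put numbers on the two approaches, here is a minimal Python sketch.
It is only an illustration, not what tar does internally; the function
names pairwise_comparisons and find_duplicates are made up for this
example. The first reproduces the pairwise estimate above, the second
finds repeated inode numbers with one hash lookup per entry.

```python
def pairwise_comparisons(n: int) -> int:
    """Number of distinct pairs among n items, i.e. n*(n-1)/2."""
    return n * (n - 1) // 2


def find_duplicates(inode_numbers):
    """Return the inode numbers that occur more than once.

    Uses a hash set, so each number costs one lookup on average
    instead of a comparison against every other number.
    """
    seen = set()
    duplicates = set()
    for ino in inode_numbers:
        if ino in seen:
            duplicates.add(ino)
        else:
            seen.add(ino)
    return duplicates


if __name__ == "__main__":
    # Roughly 8e18 pairs for 4e9 inode numbers.
    print(pairwise_comparisons(4_000_000_000))
    # Hypothetical inode numbers, just to exercise the function.
    print(find_duplicates([11, 12, 13, 12, 15, 11]))
```

With hashing the work grows linearly with the number of remembered
inodes, so the remaining cost is the memory for the set.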
It doesn't matter if there are collisions within the filesystem, as
long as there are no collisions between the set of files an
application is working on at the same time.
In the case of a backup, that is all of the files.
No, it's usually working with a _single_ file at a time. It will
remember inode numbers of files with nlink > 1, but it won't remember
all the other inode numbers.
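
The behaviour described here can be sketched roughly as follows in
Python. This is an assumption about how such a tool might be
structured, not tar's actual code; find_hard_links is a hypothetical
helper. Files are handled one at a time, and only those with
st_nlink > 1 are entered into a table keyed on (st_dev, st_ino).

```python
import os
import stat


def find_hard_links(root):
    """Walk root and report paths that are hard links to an earlier path.

    Only regular files with st_nlink > 1 are remembered; every other
    file is looked at once and then forgotten.
    """
    seen = {}    # (st_dev, st_ino) -> first path seen with that inode
    links = []   # (later path, first path) pairs
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if not stat.S_ISREG(st.st_mode) or st.st_nlink <= 1:
                continue  # nothing to remember for this file
            key = (st.st_dev, st.st_ino)
            if key in seen:
                links.append((path, seen[key]))
            else:
                seen[key] = path
    return links
```

The table only grows with the number of multiply-linked files, which
is why inode numbers of ordinary files never need to be kept around.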
You could have a filesystem with 4 billion files, each one having two
links. Not a likely scenario, though.
Miklos