Re: Finding hardlinks

From: Pavel Machek
Date: Tue Jan 02 2007 - 14:35:54 EST


Hi!

> > >> It seems like the posix idea of unique <st_dev, st_ino> doesn't
> > >> hold water for modern file systems
> > >
> > > are you really sure?
> >
> > Well Jan's example was of Coda that uses 128-bit internal file ids.
> >
> > > and if so, why don't we fix *THAT* instead
> >
> > Hmm, sometimes you can't fix the world, especially if the filesystem
> > is exported over NFS and has a problem with fitting its file IDs uniquely
> > into a 64-bit identifier.
>
> Note, it's pretty easy to fit _anything_ into a 64-bit identifier with
> the use of a good hash function. The chance of an accidental
> collision is infinitesimally small. For a set of
>
> 100 files: 0.00000000000003%
> 1,000,000 files: 0.000003%

I do not think we want to play with probability like this. I mean...
imagine 4G files, 1KB each. That's 4TB disk space, not _completely_
unreasonable, and collision probability is going to be ~100% due to
birthday paradox.

You'll still want to back up your 4TB server...

Pavel
--
Thanks for all the (sleeping) penguins.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/