Re: what is our answer to ZFS?

From: Anton Altaparmakov
Date: Tue Nov 22 2005 - 11:55:01 EST

On Tue, 22 Nov 2005, Chris Adams wrote:
> Once upon a time, Jan Harkes <jaharkes@xxxxxxxxxx> said:
> >The only thing that tends to break are userspace archiving tools like
> >tar, which assume that 2 objects with the same 32-bit st_ino value are
> >identical.
> That assumption is probably made because that's what POSIX and Single
> Unix Specification define: "The st_ino and st_dev fields taken together
> uniquely identify the file within the system." Don't blame code that
> follows standards for breaking.

The standards are insufficient however. For example dealing with named
streams or extended attributes if exposed as "normal files" would
naturally have the same st_ino (given they are the same inode as the
normal file data) and st_dev fields.

> >I think that by now several actually double check that theinode
> >linkcount is larger than 1.
> That is not a good check. I could have two separate files that have
> multiple links; if st_ino is the same, how can tar make sense of it?

Now that is true. In addition to checking the link count is larger then
1, they should check the file size and if that matches compute the SHA-1
digest of the data (or the MD5 sum or whatever) and probably should also
check the various stat fields for equality before bothering with the
checksum of the file contents.

Or Linux just needs a backup api that programs like this can use to
save/restore files. (Analogous to the MS Backup API but hopefully
less horid...)

Best regards,

Anton Altaparmakov <aia21 at> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on
WWW: &
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at