Re: Testing if two open descriptors refer to the same inode
From: Florian Weimer
Date: Mon Jul 29 2024 - 05:30:50 EST
* Aleksa Sarai:
> On 2024-07-29, Florian Weimer <fweimer@xxxxxxxxxx> wrote:
>> It was pointed out to me that inode numbers on Linux are no longer
>> expected to be unique per file system, even for local file systems.
>> Applications sometimes need to check if two (open) files are the same.
>> For example, a program may want to use a temporary file if is invoked
>> with input and output files referring to the same file.
>
> Based on the discussions we had at LSF/MM, I believe the "correct" way
> now is to do
>
> name_to_handle_at(fd, "", ..., AT_EMPTY_PATH|AT_HANDLE_FID)
>
> and then use the fhandle as the key to compare inodes. AT_HANDLE_FID is
> needed for filesystems that don't support decoding file handles, and was
> added in Linux 6.6[1]. However, I think this inode issue is only
> relevant for btree filesystems, and I think both btrfs and bcachefs both
> support decoding fhandles so this should work on fairly old kernels
> without issue (though I haven't checked).
> [1]: commit 96b2b072ee62 ("exportfs: allow exporting non-decodeable file handles to userspace")
Thanks, it's not too bad. The name_to_handle_at manual page says that
the handle is supposed to be treated as an opaque value, although it
mentions AT_HANDLE_FID. I think this needs to be fixed that it's
expected to compare the handle bytes, and also say whether it's
necessary to compare the type or not.
> Lennart suggested there should be a way to get this information from
> statx(2) so that you can get this new inode identifier without doing a
> bunch of extra syscalls to verify that inode didn't change between the
> two syscalls. I have a patchset for this, but I suspect it's too ugly
> (we can't return the full file handle so we need to hash it). I'll send
> an RFC later this week or next.
Hashing these things is rather nasty because it makes things impossible
to test.
>> How can we check for this? The POSIX way is to compare st_ino and
>> st_dev in stat output, but if inode numbers are not unique, that will
>> result in files falsely being reported as identical. It's harmless in
>> the temporary file case, but it in other scenarios, it may result in
>> data loss.
>
> (Another problem is that st_dev can be different for the same mount due
> to subvolumes.)
Uh-oh. If st_dev are different, is it still possible that truncating
one path will affect the other with the different st_dev value?
Thanks,
Florian