Re: Testing if two open descriptors refer to the same inode

From: David Sterba
Date: Wed Jul 31 2024 - 14:07:29 EST


On Mon, Jul 29, 2024 at 09:40:57PM +1000, Aleksa Sarai wrote:
> On 2024-07-29, Mateusz Guzik <mjguzik@xxxxxxxxx> wrote:
> > On Mon, Jul 29, 2024 at 12:57 PM Florian Weimer <fweimer@xxxxxxxxxx> wrote:
> > > > On Mon, Jul 29, 2024 at 12:40:35PM +0200, Florian Weimer wrote:
> > > >> > On Mon, Jul 29, 2024 at 08:55:46AM +0200, Florian Weimer wrote:
> > > >> >> It was pointed out to me that inode numbers on Linux are no longer
> > > >> >> expected to be unique per file system, even for local file systems.
> > > >> >
> > > >> > I don't know if I'm parsing this correctly.
> > > >> >
> > > >> > Are you claiming on-disk inode numbers are not guaranteed unique per
> > > >> > filesystem? It sounds like utter breakage, with capital 'f'.
> > > >>
> > > >> Yes, POSIX semantics and traditional Linux semantics for POSIX-like
> > > >> local file systems are different.
> > > >
> > > > Can you link me some threads about this?
> > >
> > > Sorry, it was an internal thread. It's supposed to be common knowledge
> > > among Linux file system developers. Aleksa referenced LSF/MM
> > > discussions.
> >
> > So much for open development :-P
>
> To be clear, this wasn't _decided_ at LSF/MM, it was brought up as a
> topic. There is an LWN article about the session that mentions the
> issue[1].

A discussion about inode numbers or subvolumes comes up every year, with
better or worse suggestions for what to do about it.

> My understanding is that the btrfs and bcachefs folks independently
> determined they cannot provide this guarantee. As far as I understand,
> the reason why is that inode number allocation on btree filesystems
> stores information about location and some other bits (maybe subvolumes)
> in the bits, making it harder to guarantee there will be no collisions.

No, on btrfs the inode numbers don't encode anything about location;
each one is just a plain number. The inode numbers remain the same when
a snapshot is taken, as it's a 1:1 clone of the file hierarchy. The
directory representing a subvolume/snapshot has the fixed inode number
256. The only difference is the internal subvolume id.
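
To make the consequence concrete, here is a minimal sketch of the
traditional userspace check for "same inode" on two open descriptors:
compare the (st_dev, st_ino) pairs returned by fstat(). The helper name
same_inode is made up for illustration. Since a snapshot keeps the
inode numbers of its source, st_ino alone cannot tell a file from its
snapshotted copy. Whether st_dev separates the two is not covered above,
so treat a match as a hint rather than a guarantee.

```python
import os


def same_inode(fd_a: int, fd_b: int) -> bool:
    """Traditional check: equal (st_dev, st_ino) pairs.

    Hypothetical helper for illustration only. The internal subvolume id
    mentioned above is not part of this comparison, so the result is only
    as reliable as st_dev is at telling subvolumes apart.
    """
    st_a = os.fstat(fd_a)
    st_b = os.fstat(fd_b)
    return (st_a.st_dev, st_a.st_ino) == (st_b.st_dev, st_b.st_ino)
```

Comparing st_ino on its own would be strictly weaker: two files at the
same path in a subvolume and in its snapshot would always compare equal.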