Re: [LSF TOPIC] statx extensions for subvol/snapshot filesystems & more

From: Kent Overstreet
Date: Thu Feb 22 2024 - 04:44:50 EST


On Thu, Feb 22, 2024 at 10:14:20AM +0100, Miklos Szeredi wrote:
> On Wed, 21 Feb 2024 at 22:08, Josef Bacik <josef@xxxxxxxxxxxxxx> wrote:
> >
> > On Wed, Feb 21, 2024 at 04:06:34PM +0100, Miklos Szeredi wrote:
> > > On Wed, 21 Feb 2024 at 01:51, Kent Overstreet <kent.overstreet@xxxxxxxxx> wrote:
> > > >
> > > > Recently we had a pretty long discussion on statx extensions, which
> > > > eventually got a bit offtopic but nevertheless hashed out all the major
> > > > issues.
> > > >
> > > > To summarize:
> > > > - guaranteeing inode number uniqueness is becoming increasingly
> > > > infeasible, we need a bit to tell userspace "inode number is not
> > > > unique, use filehandle instead"
> > >
> > > This is a tough one. POSIX says "The st_ino and st_dev fields taken
> > > together uniquely identify the file within the system."
> > >
> >
> > Which is what btrfs has done forever, and we've gotten yelled at forever for
> > doing it. We have a compromise and a way forward, but it's not a widely held
> > view that changing st_dev to give uniqueness is an acceptable solution. It may
> > have been for overlayfs because you guys are already doing something special,
> > but it's not an option that is afforded the rest of us.
>
> Overlayfs tries hard not to use st_dev to give uniqueness and instead
> partitions the 64bit st_ino space within the same st_dev. There are
> various fallback cases, some involve switching st_dev and some using
> non-persistent st_ino.

Yeah no, you can't cram multiple 64-bit inode number spaces into 64
bits: pigeonhole principle.
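
To spell the counting out: N layers that can each hand out any 64-bit
inode number give N * 2^64 possible (layer, ino) pairs, and for N >= 2
that's more than the 2^64 values st_ino can hold. A rough sketch,
assuming a made-up scheme that reserves the top bits of st_ino for a
layer index (LAYER_BITS, pack_ino() and pack_ino_truncating() are
hypothetical names for illustration, not what overlayfs actually does):

```python
# Hypothetical packing scheme for illustration only: reserve the top
# LAYER_BITS of a 64-bit st_ino for a layer index.
INO_BITS = 64
LAYER_BITS = 2
LOW_BITS = INO_BITS - LAYER_BITS
LOW_MASK = (1 << LOW_BITS) - 1


def pack_ino(layer, ino):
    """Return the packed inode number, or None if ino does not fit."""
    if not 0 <= layer < (1 << LAYER_BITS):
        raise ValueError("layer index out of range")
    if ino & ~LOW_MASK:
        # The underlying fs already uses the reserved high bits, so
        # the caller has to fall back to something else.
        return None
    return (layer << LOW_BITS) | ino


def pack_ino_truncating(layer, ino):
    """Same layout, but silently drop the high bits instead."""
    return (layer << LOW_BITS) | (ino & LOW_MASK)


# Inode numbers that fit in the low bits pack cleanly.
small = pack_ino(1, 42)

# An inode number that uses the full 64 bits cannot be packed...
big = (1 << 63) | 42
fallback = pack_ino(0, big)

# ...and truncating it collides with a different inode on that layer.
collision = pack_ino_truncating(0, big) == pack_ino_truncating(0, 42)
```

So this only holds up while every underlying filesystem leaves the
reserved bits unused; once one doesn't, you either fall back or
collide.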

We need something better than "hacks".

> What overlayfs does may or may not be applicable to btrfs/bcachefs,
> but that's not my point. My point is that adding a flag to statx does
> not solve anything. You can't just say that from now on btrfs
> doesn't have to use unique st_ino/st_dev because we've just indicated
> that in statx and everything is fine. That will trigger the
> no-regressions rule and then it's game over. At least I would expect
> that to happen.
>
> What we can do instead is introduce a new API that is better,

This isn't a serious proposal.