Re: [PATCH] fs: inode: Reduce volatile inode wraparound risk when ino_t is 64 bit

From: Amir Goldstein
Date: Sat Dec 21 2019 - 03:43:24 EST


On Fri, Dec 20, 2019 at 11:33 PM Darrick J. Wong
<darrick.wong@xxxxxxxxxx> wrote:
>
> On Fri, Dec 20, 2019 at 02:49:36AM +0000, Chris Down wrote:
> > In Facebook production we are seeing heavy inode number wraparounds on
> > tmpfs. On affected tiers, in excess of 10% of hosts show multiple files
> > with different content and the same inode number, with some servers even
> > having as many as 150 duplicated inode numbers with differing file
> > content.
> >
> > This causes actual, tangible problems in production. For example, we
> > have complaints from those working on remote caches that their
> > application is reporting cache corruptions because it uses (device,
> > inodenum) to establish the identity of a particular cache object, but
>
> ...but you cannot delete the (dev, inum) tuple from the cache index when
> you remove a cache object??
>
> > because it's not unique any more, the application refuses to continue
> > and reports cache corruption. Even worse, sometimes applications may not
> > even detect the corruption but may continue anyway, causing phantom and
> > hard to debug behaviour.
> >
> > In general, userspace applications expect that (device, inodenum) should
> > be enough to uniquely point to one inode, which seems fair enough.
>
> Except that it's not. (dev, inum, generation) uniquely points to an
> instance of an inode from creation to the last unlink.
>

Yes, but also: there should not exist two live inodes on the system with
the same (dev, inum). The problem is that ino 1 may still be alive when
the wraparound happens, and then two different inodes with ino 1 exist
on the same dev.
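To illustrate, this is roughly the identity check that applications
using (dev, inum) perform; a minimal sketch, the struct and function
names are illustrative and not taken from any real cache:

/*
 * Sketch of a (dev, ino) identity check; after an ino_t wraparound a
 * new, unrelated file can match an old key, because stat(2) exposes no
 * generation number to disambiguate the two.
 */
#include <stdbool.h>
#include <sys/stat.h>

struct cache_key {
	dev_t dev;
	ino_t ino;
};

static bool same_object(const struct cache_key *key, const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return false;

	return key->dev == st.st_dev && key->ino == st.st_ino;
}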

Take the 'diff' utility for example: it will report that those files
are identical if they have the same dev, ino, size and mtime. I suspect
that 'mv' will not let you move one over the other, assuming they are
hardlinks. The generation is not even exposed to legacy applications
using stat(2).
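For what it's worth, an application that is willing to go beyond
stat(2) can try to fetch the generation with the FS_IOC_GETVERSION
ioctl, but that is best effort only, as a sketch; not every filesystem
implements this ioctl:

/* Best-effort query of the inode generation via FS_IOC_GETVERSION. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/fs.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	int fd, gen;

	if (argc != 2)
		return 1;

	fd = open(argv[1], O_RDONLY);
	if (fd < 0)
		return 1;

	if (ioctl(fd, FS_IOC_GETVERSION, &gen) == 0)
		printf("generation: %d\n", gen);
	else
		perror("FS_IOC_GETVERSION");

	close(fd);
	return 0;
}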

Thanks,
Amir.