Re: [RFC PATCH v1 00/30] fs: inode->i_version rework and optimization

From: Dave Chinner
Date: Sat Apr 01 2017 - 19:06:05 EST


On Thu, Mar 30, 2017 at 12:12:31PM -0400, J. Bruce Fields wrote:
> On Thu, Mar 30, 2017 at 07:11:48AM -0400, Jeff Layton wrote:
> > On Thu, 2017-03-30 at 08:47 +0200, Jan Kara wrote:
> > > Because if above is acceptable we could make reported i_version to be a sum
> > > of "superblock crash counter" and "inode i_version". We increment
> > > "superblock crash counter" whenever we detect unclean filesystem shutdown.
> > > That way after a crash we are guaranteed each inode will report new
> > > i_version (the sum would probably have to look like "superblock crash
> > > counter" * 65536 + "inode i_version" so that we avoid reusing possible
> > > i_version numbers we gave away but did not write to disk but still...).
> > > Thoughts?
>
> How hard is this for filesystems to support? Do they need an on-disk
> format change to keep track of the crash counter?

Yes. We'll need a version counter in the superblock, and we'll need to
know what the increment semantics are.
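
For reference, the composition Jan describes above could be sketched
roughly as follows. This is only an illustration: reported_i_version()
and CRASH_SHIFT are hypothetical names, not existing kernel API, and
the 16-bit shift just mirrors the "* 65536" in the proposal.

```c
#include <stdint.h>

/* Hypothetical: 65536 == 1 << 16, taken from the "* 65536" above. */
#define CRASH_SHIFT 16

/*
 * Hypothetical helper: combine a per-superblock crash counter with the
 * per-inode i_version. Each bump of the crash counter moves the
 * reported value past up to 65535 in-memory increments that may never
 * have reached disk.
 */
static inline uint64_t reported_i_version(uint64_t sb_crash_counter,
					  uint64_t inode_i_version)
{
	return (sb_crash_counter << CRASH_SHIFT) + inode_i_version;
}
```

Note that this only avoids reuse if fewer than 65536 unwritten
increments were handed out before the crash, which is exactly the kind
of semantics that would have to be defined.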

The big question is how do we know there was a crash? The only thing
a journalling filesystem knows at mount time is whether it is clean
or requires recovery. Filesystems can require recovery for many
reasons that don't involve a crash (e.g. root fs is never unmounted
cleanly, so always requires recovery). Further, some filesystems may
not even know there was a crash at mount time because their
architecture always leaves a consistent filesystem on disk (e.g. COW
filesystems)....
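
To make the problem concrete, here is a minimal sketch of the obvious
mount-time rule, assuming the only input is a "needs recovery" flag.
struct sketch_sb and sketch_mount_bump() are made up for illustration
and do not correspond to any real filesystem's structures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical in-memory superblock state, not a real kernel struct. */
struct sketch_sb {
	uint64_t crash_counter;	/* would need a new on-disk field */
	bool needs_recovery;	/* all a journalling fs knows at mount */
};

/*
 * Bump the counter whenever the filesystem requires recovery.
 * Returns true if the counter was bumped.
 */
static bool sketch_mount_bump(struct sketch_sb *sb)
{
	if (!sb->needs_recovery)
		return false;	/* clean, or a COW fs that always looks clean */

	sb->crash_counter++;	/* also fires when no crash happened */
	sb->needs_recovery = false;
	return true;
}
```

With this rule a root fs that is never unmounted cleanly bumps the
counter on every boot, and a COW filesystem never bumps it at all.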

> I wonder if repeated crashes can lead to any odd corner cases.

Without defined, locked-down behaviour of the superblock counter,
corner cases will almost certainly exist...

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx