Re: [man-pages RFC PATCH v4] statx, inode: document the new STATX_INO_VERSION field

From: Jeff Layton
Date: Wed Sep 07 2022 - 08:58:33 EST


On Wed, 2022-09-07 at 08:20 -0400, J. Bruce Fields wrote:
> On Wed, Sep 07, 2022 at 09:37:33PM +1000, NeilBrown wrote:
> > On Wed, 07 Sep 2022, Jeff Layton wrote:
> > > +The change to \fIstatx.stx_ino_version\fP is not atomic with respect to the
> > > +other changes in the inode. On a write, for instance, the i_version is usually
> > > +incremented before the data is copied into the pagecache. Therefore it is
> > > +possible to see a new i_version value while a read still shows the old data.
> >
> > Doesn't that make the value useless? Surely the change number must
> > change no sooner than the change itself is visible, otherwise stale data
> > could be cached indefinitely.
>
> For the purposes of NFS close-to-open, I guess all we need is for the
> change attribute increment to happen sometime between the open and the
> close.
>
> But, yes, it'd seem a lot more useful if it was guaranteed to happen
> after. (Or before and after both--extraneous increments aren't a big
> problem here.)
>
>

For NFS, I don't think extraneous increments would be a big problem.

We don't want increments due to reads that may happen well after the
initial write, but as long as the second increment comes in fairly soon
after the initial one, the extra invalidations shouldn't be _too_ bad.

You might have a reader race in and see the interim value, but we'd
probably want the reader to invalidate the cache soon after that anyway.
The file was clearly in flux at the time of the read.
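
To make the race concrete, here's a toy model of it. This is only a sketch,
not kernel code: Inode, CachingClient, write() and final_read() are all
made-up names, and the "client" is just something that revalidates its cache
by comparing change attribute values. It shows a reader racing in between
the bump and the data copy, with and without a second bump afterward.

```python
# Toy model only: none of these names correspond to real kernel or NFS APIs.

class Inode:
    def __init__(self):
        self.version = 0      # stands in for i_version
        self.data = "old"     # stands in for the pagecache contents


class CachingClient:
    """Caches file data and revalidates it against the change attribute."""

    def __init__(self):
        self.cached_version = None
        self.cached_data = None

    def read(self, inode):
        # Refetch only if the change attribute moved since the last look.
        if self.cached_version != inode.version:
            self.cached_version = inode.version
            self.cached_data = inode.data
        return self.cached_data


def write(inode, new_data, bump_after, racing_reader=None):
    inode.version += 1                 # bump before the data is visible
    if racing_reader is not None:
        racing_reader.read(inode)      # reader sees new version, old data
    inode.data = new_data              # data becomes visible
    if bump_after:
        inode.version += 1             # second bump, after the data is visible


def final_read(bump_after):
    """Return what the client reads once the write has fully completed."""
    inode, client = Inode(), CachingClient()
    client.read(inode)                 # prime the cache
    write(inode, "new", bump_after, racing_reader=client)
    return client.read(inode)
```

In this model, final_read(False) hands back the old data even though the
write has completed, since the client cached it under the already-bumped
value. final_read(True) returns the new data, at the cost of one extra
invalidation for the reader that saw the interim value.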

Allowing for this sort of thing is why I've been advocating against
trying to define this value too strictly. If we were to confine
ourselves to "one bump per change" then it'd be hard to pull this off.

Maybe bumping it both before and after the change is what we should be doing?

> >
> > If current implementations behave this way, surely they are broken.
> >

--
Jeff Layton <jlayton@xxxxxxxxxx>