Re: [PATCH v8 0/5] fs: multigrain timestamps for XFS's change_cookie
From: Amir Goldstein
Date: Sat Sep 23 2023 - 03:16:04 EST
On Fri, Sep 22, 2023 at 8:15 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>
> My initial goal was to implement multigrain timestamps on most major
> filesystems, so we could present them to userland, and use them for
> NFSv3, etc.
>
> With the current implementation however, we can't guarantee that a file
> with a coarse grained timestamp modified after one with a fine grained
> timestamp will always appear to have a later value. This could confuse
> some programs like make, rsync, find, etc. that depend on strict
> ordering requirements for timestamps.
>
> The goal of this version is more modest: fix XFS' change attribute.
> XFS's change attribute is bumped on atime updates in addition to other
> deliberate changes. This makes it unsuitable for export via nfsd.
>
> Jan Kara suggested keeping this functionality internal-only for now and
> plumbing the fine grained timestamps through getattr [1]. This set takes
> a slightly different approach and has XFS use the fine-grained attr to
> fake up STATX_CHANGE_COOKIE in its getattr routine itself.
>
> While we keep fine-grained timestamps in struct inode, when presenting
> the timestamps via getattr, we truncate them at a granularity of number
> of ns per jiffy,
That's not good, because an mtime that the user explicitly set with
fine granularity would be truncated too, and booting kernels built
with different HZ values would change the observed timestamps of files.
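To make the HZ dependency concrete, here is a small userspace sketch (not
kernel code). The helper name trunc_to_jiffy() and the way HZ is passed in
as a parameter are my own assumptions for illustration; the series may do
the truncation differently.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/*
 * Hypothetical helper: round the nanosecond part of a timestamp down to
 * a multiple of "ns per jiffy" for a given HZ. Only meant to show how
 * the visible value depends on HZ.
 */
static int64_t trunc_to_jiffy(int64_t nsec, int64_t hz)
{
	assert(hz > 0 && hz <= NSEC_PER_SEC);

	int64_t ns_per_jiffy = NSEC_PER_SEC / hz;

	return nsec - (nsec % ns_per_jiffy);
}
```

With an mtime nanosecond field of 987654321 (say, set via utimensat()),
this gives 980000000 for HZ=100, 984000000 for HZ=250 and 987000000 for
HZ=1000, so the same file would show three different timestamps.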
> which allows us to smooth over the fuzz that causes
> ordering problems.
>
The reported ordering problems (e.g. cp -u) are not even limited to the
scope of a single fs, right?
Thinking out loud - if the QUERIED bit was not kept per inode timestamp
but instead in a global fs_multigrain_ts variable, then all the inodes
of all the mgtime filesystems would be using globally ordered timestamps.
That should eliminate the reported issues with timestamp reordering
between fine and coarse grained timestamps.
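A rough userspace model of what I have in mind, under my own assumptions:
the struct, the function names, and the choice to remember the last value
handed out (so that a later coarse stamp never goes below an earlier fine
one) are all hypothetical, and the clock readings are passed in as plain
arguments. Real kernel code would need atomics instead of plain stores.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical global state shared by all mgtime filesystems. */
struct mg_global {
	uint64_t last_ns;	/* last timestamp handed out to any inode */
	bool queried;		/* some timestamp was exposed via getattr */
};

/* getattr path: remember that somebody looked at a timestamp. */
static void mg_mark_queried(struct mg_global *g)
{
	g->queried = true;
}

/*
 * Update path: pick the timestamp for an inode being modified.
 * coarse_ns and fine_ns stand in for the coarse and fine clock
 * readings at the time of the update.
 */
static uint64_t mg_stamp(struct mg_global *g, uint64_t coarse_ns,
			 uint64_t fine_ns)
{
	uint64_t now = coarse_ns;

	if (g->queried) {
		/* Someone observed a timestamp: use the fine clock once. */
		now = fine_ns;
		g->queried = false;
	}

	/* Never hand out a value older than the previous one. */
	if (now < g->last_ns)
		now = g->last_ns;

	g->last_ns = now;
	return now;
}
```

In this model any getattr on any inode forces the next update anywhere to
take a fine grained value, which is the source of the extra cookie
updates, but the values handed out never go backwards across inodes or
filesystems.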
The risk of extra unneeded "change cookie" updates compared to a
per inode QUERIED bit may exist, but I think it is a rather small overhead
and maybe a worthwhile tradeoff against having to maintain a real per inode
"change cookie" in addition to a "globally ordered mgtime"?
If this idea is acceptable, you may still be able to salvage the reverted
ctime series for 6.7, because the change to use global mgtime should
be quite trivial?
Thanks,
Amir.