Re: Proposal: Use hi-res clock for file timestamps

From: Neil Brown
Date: Wed Aug 18 2010 - 22:44:32 EST


On Wed, 18 Aug 2010 22:08:03 -0400
"J. Bruce Fields" <bfields@xxxxxxxxxxxx> wrote:

> On Thu, Aug 19, 2010 at 10:52:18AM +1000, Neil Brown wrote:
> > On Thu, 19 Aug 2010 09:41:36 +1000
> > Neil Brown <neilb@xxxxxxx> wrote:
> >
> > > So I agree that this is probably more of an issue for directories than for
> > > files, and that implementing it just for directories would be a sensible
> > > first step with lower expected overhead - just my reasoning seems to be a bit
> > > different.
> >
> > Just to be sure we are on the same page:
> > file_update_time would always refer to current_nfsd_time, but nfsd would
> > only update current_nfsd_time when a directory was examined (and the other
> > conditions were met).
> >
> >
> > So my current thinking on how this would look - names have been changed:
> >
> > - global timespec 'current_fs_precise_time' is zeroed when
> > current_kernel_time moves backwards and is protected by a seqlock
> >
> > - current_fs_time would be
> > now = max(current_kernel_time(), current_fs_precise_time)
> > return timespec_trunc(now, sb->s_time_gran)
> > (with appropriate seqlock protection)
> >
> > - new function in fs/inode.c
> > get_precise_time(timestamp)
>
> Odd name for something that returns nothing of interest;
> bump_precise_time() might be closer?
>
> And unique_time might be better than precise_time, since the property
> we're asking for is that mtime on a changed file be new? (Or
> versioned_time?)

Agreed on both counts, though I'm not keen on 'bump' myself.
got_unique_time()
because that is what we just did... I prefer the name to reflect why the
function is called, rather than what the function is expected to do about it.
never_use_this_timestamp_again(timestamp)
:-?


>
> > cft = current_fs_time()
> > if (timestamp == cft)
> /*
> * Make sure the next mtime stored will be
> * something different from timestamp:
> */
> > write_seqlock()
> > if cft == current_fs_precise_time
> > current_fs_precise_time.tv_nsec++
> > else if cft > current_fs_precise_time
>
> What's the cft < current_fs_precise_time case?

That is the case where current_fs_precise_time has been incremented at a
resolution finer than s_time_gran, i.e. s_time_gran > 1, so truncating to
s_time_gran leaves cft below it.
I'm not really sure what we want to do about that.
Maybe we should be incrementing tv_nsec by s_time_gran as long as that is
significantly less than jiffies_to_usecs(1)*1000, but I don't know what I mean
by 'significantly'.
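
To make the cft < current_fs_precise_time case concrete, here is a rough
userspace sketch, not kernel code. trunc_to_gran() is a stand-in I am
assuming for timespec_trunc(), and sketch_current_fs_time() and ts_cmp() are
hypothetical helpers that follow the max-then-truncate outline quoted above
(seqlock protection omitted).

```c
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Assumed stand-in for timespec_trunc(): round tv_nsec down to a
 * multiple of gran (nanoseconds, 1..NSEC_PER_SEC). */
static struct timespec trunc_to_gran(struct timespec t, long gran)
{
	assert(gran >= 1 && gran <= NSEC_PER_SEC);
	t.tv_nsec -= t.tv_nsec % gran;	/* gran == NSEC_PER_SEC zeroes tv_nsec */
	return t;
}

/* Three-way compare: <0, 0, >0 as a is before, equal to, after b. */
static int ts_cmp(struct timespec a, struct timespec b)
{
	if (a.tv_sec != b.tv_sec)
		return a.tv_sec < b.tv_sec ? -1 : 1;
	if (a.tv_nsec != b.tv_nsec)
		return a.tv_nsec < b.tv_nsec ? -1 : 1;
	return 0;
}

/* Hypothetical model of the proposed current_fs_time():
 * now = max(kernel time, precise time), then truncate to gran. */
static struct timespec sketch_current_fs_time(struct timespec kernel_now,
					      struct timespec precise,
					      long gran)
{
	struct timespec now =
		ts_cmp(kernel_now, precise) >= 0 ? kernel_now : precise;

	return trunc_to_gran(now, gran);
}
```

With a granularity of 1000 and a precise time that was bumped by 1ns past a
multiple of 1000, the truncated result compares below the precise time; with
a granularity of 1 the two are equal.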

The only values I can find for s_time_gran in current code are 1, 100, 1000
and 1000000000.
All those are either way bigger than a jiffy or significantly smaller, but
suppose a filesystem came along that chose 1000000 (i.e. millisecond
timestamps) - should we increment tv_nsec by 1000000, or not, or cross that
bridge when we come to it?
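
One possible shape for "increment by s_time_gran when it is small enough",
again only as a userspace sketch. pick_step() and bump_ns() are hypothetical
names, and the 'factor' argument is an assumed knob standing in for
'significantly', which nothing above actually defines.

```c
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Choose the bump step in nanoseconds: use gran when it is at least
 * 'factor' times smaller than one jiffy, otherwise fall back to 1ns.
 * 'factor' is an assumption, not something taken from the proposal. */
static long pick_step(long gran, long jiffy_ns, long factor)
{
	assert(gran >= 1 && jiffy_ns >= 1 && factor >= 1);
	if ((long long)gran * factor <= jiffy_ns)
		return gran;
	return 1;
}

/* Advance *t by step nanoseconds, carrying overflow into tv_sec. */
static void bump_ns(struct timespec *t, long step)
{
	assert(step >= 1 && step <= NSEC_PER_SEC);
	t->tv_nsec += step;
	if (t->tv_nsec >= NSEC_PER_SEC) {
		t->tv_nsec -= NSEC_PER_SEC;
		t->tv_sec++;
	}
}
```

With a 1000000ns jiffy (HZ=1000) and factor 10, granularities of 100 and 1000
would be stepped by their own size, while 1000000 and 1000000000 would fall
back to a 1ns step. That fallback does not settle the millisecond question,
it only shows where the decision would sit.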

For reference:
default is 1000000000 (this would cover ext2, ext3, reiserfs, fat, sysv, ...)
cifs, smbfs, ntfs are 100
udf, ceph are 1000
rest (btrfs, ext4, gfs2, jfs, nilfs, ocfs2, xfs and virtual filesystems) are 1

NeilBrown
