Re: [Y2038] [RFC 02/15] vfs: Change all structures to support 64 bit time

From: Andreas Dilger
Date: Sat Jan 16 2016 - 14:14:48 EST

> On Jan 15, 2016, at 9:50 AM, Arnd Bergmann <arnd@xxxxxxxx> wrote:
> On Friday 15 January 2016 13:49:37 Dave Chinner wrote:
>> On Thu, Jan 14, 2016 at 11:46:16PM +0100, Arnd Bergmann wrote:
>>> On Friday 15 January 2016 08:00:01 Dave Chinner wrote:
>>>> On Thu, Jan 14, 2016 at 05:53:21PM +0100, Arnd Bergmann wrote:
>>>>> On Thursday 14 January 2016 08:04:36 Dave Chinner wrote:
>>> Yes, that is the obvious case, and I guess works for at least half the
>>> file systems when they always assign righthand side and lefthand
>>> side of the time stamps using the external types or helpers like
>>> CURRENT_TIME and current_fs_time().
>>> However, there are a couple of file systems that need a bit more refactoring
>>> before we can do this, e.g. in ntfs_truncate:
>> Sure, and nfs is a pain because of all it's internal use of
>> timespecs, too.
> lustre is probably the worst.

Lustre currently only has one-second granularity in a 64-bit field,
so it doesn't really care about the difference between timespec or
timespec64 at all.

The only other uses are for measuring relative times, so the 64-bitness
shouldn't really matter.

Could you please point out what issues exist so they can be fixed.

Cheers, Andreas

>> But we have these timespec_to_timespec64 helper
>> functions, and that's what we should use in these cases where the
>> filesystem cannot support full 64 bit timestamps internally. In
>> those cases, they'll be telling the superblock this at mount time
>> things like current_fs_time() won't be returning then a timestamp
>> that is out of range for a 32 bit timestamp....
> I'm not worried about the runtime problems here, just how to
> get a series of patches that are each doing one reasonable thing
> at a time.
>>> if (!IS_NOCMTIME(VFS_I(base_ni)) && !IS_RDONLY(VFS_I(base_ni))) {
>>> struct timespec now = current_fs_time(VFS_I(base_ni)->i_sb);
>>> int sync_it = 0;
>>> if (!timespec_equal(&VFS_I(base_ni)->i_mtime, &now) ||
>>> !timespec_equal(&VFS_I(base_ni)->i_ctime, &now))
>>> sync_it = 1;
>>> VFS_I(base_ni)->i_mtime = now;
>>> VFS_I(base_ni)->i_ctime = now;
>>> }
>>> The type of the local variable must match the return code of
>>> current_fs_time(), so if we change over i_mtime and current_fs_time
>>> globally, this either has to be rewritten first to avoid the use of
>>> local variables, or it needs temporary conversion helpers, or
>>> it has to be changed in the same patch. None of those is particularly
>>> appealing. There are a few dozen such things in various file systems.
>> it gets rewritten to:
>> struct timespec now;
>> now = timespec64_to_timespec(current_fs_time(VFS_I(base_ni)->i_sb));
>> ....
>> and the valid timestamp range for ntfs is set to 32bit timestamps.
>> This then leaves it up to the filesystem developers to make
>> the ntfs filesystem code 64 bit timestamp clean if the on disk
>> format is ever changed to support 64 bit times.
>> Same goes for NFS, and any of the other filesystems that use struct
>> timespec internally for time representation.
> That is what I meant in my previous mail about approach c) being ugly
> because it requires sprinkling lots of timespec64_to_timespec() and
> timespec_to_timespec64() in the initial patch in order to atomically
> change the types in inode/iattr/kstat/... without introducing build
> regressions.
> It's a rather horrible patch, and quite likely to cause conflicts with
> other patches that introduce another use of those structures in the
> merge window.
>>> Having a global sysctl knob or
>>> a compile-time option is better than having each file system
>>> implementor take a guess at what users might prefer, if we can't
>>> come up with a behavior (e.g. clamp all the time, or error out
>>> all the time) that everybody agrees is always correct.
>> filesystem implementors will use the helper funtions that are
>> provided for this. If they don't (like all the current use of
>> CURRENT_TIME), then that's a bug that needs fixing.
> Ok, then we are in total agreement here: the policy remains
> to be decided by common code, but the implementation can differ
> per file system.
>> i.e. we need
>> a timespec_clamp() function, similar to timespec_trunc(), and y2038k
>> compliant filesystems and syscalls need to use them....
> I was thinking we end up with a single function that does both
> clamp() and trunk(), but that's an implementation detail.
>>>>> Let me clarify what my idea is here: I want a global kernel option
>>>>> that disables all code that has known y2038 issues. If anyone tries
>>>>> to build an embedded system with support beyond 2038, that should
>>>>> disable all of those things, including file systems, drivers and
>>>>> system calls, so we can reasonably assume that everything that works
>>>>> today with that kernel build will keep working in the future and
>>>>> not break in random ways.
>>>> It's not that black and white when it comes to filesystems. y2038k
>>>> support is determined by the on-disk structure of the filesystem
>>>> being mounted, and that is determined at mount time. When the
>>>> filesystem mounts and sets it's valid timestamp ranges the VFS
>>>> will need to decide as to whether the filesystem is allowed to
>>>> continue mounting or not.
>>> Some file systems are always broken around 2038 (e.g. HFS in 2040),
>>> so if we can't fix them, I want to be able to turn them off
>>> in Kconfig along with the 32-bit time_t syscalls.
>> That can be done with kconfig depends rules - it has nothing to do
>> with this patch set.
> kconfig dependencies is what I meant for the simple cases where a
> file system is known to always be broken, we just need a small
> modification for the cases you mentioned below.
>>> ext2/3/4, xfs and ocfs2 (maybe one or two more, I'd have to check)
>>> currently behave in a consistent manner across 32-bit and 64-bit
>>> architectures by allowing a range between 1902 and 2037, and we
>>> obviously don't have a choice there but to keep that current
>>> behavior, and extend the time format in one way or another to
>>> store additional bits for the epoch.
>> That's a filesystem implementation problem, not a generic inode
>> timestamp problem. i.e. this is handled when the filesystem converts
>> the inode timestamp from a timespec64 in the struct inode to
>> whatever format it stores the timestamp on disk. That conversion
>> does not change just because the VFS inode moves from a timespec to
>> a timespec64. Again, those on-disk format changes to support beyond
>> the current epoch are outside the scope of this patchset, because
>> they are not affected by the timestamp format the VFS choses to use.
> Fine with me, we can have another series to add the Kconfig dependencies
> and modify the file systems that need this.
> Arnd
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at

Cheers, Andreas

Attachment: signature.asc
Description: Message signed with OpenPGP using GPGMail