Re: [RFC v2a 11/12] net: ceph: use vfs_time data type instead of timespec

From: Deepa Dinamani
Date: Sat Feb 13 2016 - 20:46:25 EST


On Sat, Feb 13, 2016 at 2:08 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> On Fri, Feb 12, 2016 at 01:36:05AM -0800, Deepa Dinamani wrote:
>> The VFS inode timestamps are not y2038 safe as they use
>> struct timespec. These will be changed to use struct timespec64
>> instead and that is y2038 safe.
>> But, since the above data type conversion will break the end
>> file systems, use vfs_time aliases here to access inode times.
>>
>> These timestamps are passed in as arguments to functions
>> using inode timestamps. Hence, these need to change along
>> with vfs to support 64 bit timestamps. vfs_time helps do
>> this transition.
>>
>> Signed-off-by: Deepa Dinamani <deepa.kernel@xxxxxxxxx>
>
> Just a point to highlight the problem with this approach:
>
>> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
>> index f8f2359..1273db6 100644
>> --- a/net/ceph/osd_client.c
>> +++ b/net/ceph/osd_client.c
>> @@ -2401,7 +2401,7 @@ bad:
>> */
>> void ceph_osdc_build_request(struct ceph_osd_request *req, u64 off,
>> struct ceph_snap_context *snapc, u64 snap_id,
>> - struct timespec *mtime)
>> + struct vfs_time *mtime)
>> {
>> struct ceph_msg *msg = req->r_request;
>> void *p;
>
> So this change assumes that mtime is not passed by reference to
> another function. If we change vfs_time to be a timespec64, then
> dereferencing in this function works fine, but passing to another
> function will not because that function will be expecting a
> timespec.
>
> That, indeed, is what happens here. A few lines into this function:
>
> if (req->r_flags & CEPH_OSD_FLAG_WRITE)
> ceph_encode_timespec(p, mtime);
>
> And that function:
>
> static inline void ceph_encode_timespec(struct ceph_timespec *tv,
> const struct timespec *ts)
> {
> tv->tv_sec = cpu_to_le32((u32)ts->tv_sec);
> tv->tv_nsec = cpu_to_le32((u32)ts->tv_nsec);
> }

I'm not sure where you picked up this encode function from.

You might be missing the patches (9 and 10) that come before this one:

2b5f8e517c6fd7121fc1b890c51c6256bc21beb6 net: ceph: use vfs_time data type instead of timespec
ca5b82952a6a522ae058ccede57ba1a71da498c5 fs: ceph: Replace timespec data type with vfs_time
3a3ac0bdd23284c4f27a7ab1c133056c1a998075 fs: ceph: Change encode and decode functions to use vfs_time

So the encode and decode functions already take vfs_time at this point.
The decode side, for example, actually looks like:

-static inline void ceph_decode_timespec(struct timespec *ts,
+static inline void ceph_decode_timespec(struct vfs_time *ts,
const struct ceph_timespec *tv)
{
- ts->tv_sec = (__kernel_time_t)le32_to_cpu(tv->tv_sec);
- ts->tv_nsec = (long)le32_to_cpu(tv->tv_nsec);
+ ts->tv_sec = (s64)(u32)le32_to_cpu(tv->tv_sec);
+ ts->tv_nsec = (long)(u32)le32_to_cpu(tv->tv_nsec);
}

There is a conversion error here, which I will fix separately
from this series.
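
To make the cast chain in the hunk above concrete, here is a minimal
userspace sketch. The message does not say what the conversion error is,
so this only shows how two possible cast chains differ. The struct and
function names are hypothetical stand-ins, not kernel definitions, and the
wire value is assumed to be in CPU byte order already.

```c
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the kernel's definitions. */
struct wire_timespec {
	uint32_t tv_sec;	/* assumed already converted to CPU byte order */
	uint32_t tv_nsec;
};

struct demo_time {
	int64_t tv_sec;
	long tv_nsec;
};

/* Same cast chain as the patched decode: the 32-bit value is zero-extended. */
static void decode_zero_extend(struct demo_time *ts,
			       const struct wire_timespec *tv)
{
	ts->tv_sec = (int64_t)(uint32_t)tv->tv_sec;
	ts->tv_nsec = (long)(uint32_t)tv->tv_nsec;
}

/*
 * Alternative for comparison: treat the seconds field as signed 32 bit,
 * so it is sign-extended. Converting an out-of-range uint32_t to int32_t
 * is implementation-defined in C; common compilers wrap around.
 */
static void decode_sign_extend(struct demo_time *ts,
			       const struct wire_timespec *tv)
{
	ts->tv_sec = (int64_t)(int32_t)tv->tv_sec;
	ts->tv_nsec = (long)(uint32_t)tv->tv_nsec;
}
```

With a wire value of 0xFFFFFFFF in the seconds field, the first variant
decodes to a time in 2106 and the second to one second before the epoch.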

Also, another commit in my tree, pointed to in the cover letter,
is required here:

40219ae801c0284a233ed908b07bb468d219cbc8 net: ceph: Remove CURRENT_TIME references

The changes have been split so that they can be done in manageable chunks,
just as we are not changing all filesystems at once.

> I think an approach that requires changes to the API without
> actually being able to verify they are correct, fully propagated or
> don't impact on write/disk formats before the final change of the
> VFS type is not going to fly. This is the sort of subtle bug that
> can occur with type changes, and hence why I think that the fs
> developers should be left to do the conversion of their filesystem
> to support 64 bit times (i.e. approach 2b).

Approach 2b has the same problem: the conversion cannot be verified before
the vfs switch. Consider what happens if you miss one of the instances of
direct inode time access. So with 2b you also cannot fully verify that the
changes have been propagated everywhere. The only approach that allows this
is 2c, because the data types in the individual filesystems are decoupled
from the vfs data types from the get-go.

Besides, anything omitted by 2a or 2b during the conversion should be easy
to catch when the vfs does switch. At that point, if the data types do not
match, there will be warnings for pointer conversions and errors for
pass by value.
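
A minimal sketch of why such a mismatch is visible to the compiler. The
type and function names are hypothetical stand-ins for timespec,
timespec64 and an unconverted helper. The mismatching calls are left in a
comment so that the block itself builds cleanly.

```c
#include <stdint.h>

/* Hypothetical stand-ins for timespec and timespec64 as distinct types. */
struct demo_timespec   { long    tv_sec; long tv_nsec; };
struct demo_timespec64 { int64_t tv_sec; long tv_nsec; };

/* Unconverted helpers that still expect the old type. */
static long old_seconds_by_ptr(const struct demo_timespec *ts)
{
	return ts->tv_sec;
}

static long old_seconds_by_value(struct demo_timespec ts)
{
	return ts.tv_sec;
}

/* Explicit conversion, so the narrowing is visible at the call site. */
static struct demo_timespec demo_to_timespec(struct demo_timespec64 ts64)
{
	struct demo_timespec ts = { (long)ts64.tv_sec, ts64.tv_nsec };
	return ts;
}

/*
 * With "struct demo_timespec64 t64;" in scope:
 *
 *   old_seconds_by_ptr(&t64);    incompatible pointer type: a warning,
 *                                or an error on stricter compilers
 *   old_seconds_by_value(t64);   incompatible argument type: an error
 *
 *   old_seconds_by_value(demo_to_timespec(t64));   builds cleanly
 */
```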

Apart from this, the process by which individual filesystems are converted
will help avoid these bugs. Here is one of the tricks I plan to use
(taking approach 2a as the example):

1. Change an individual filesystem to use vfs_time.
2. Change vfs to timespec64 and verify that the targeted filesystem
actually compiles.
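
The two steps can be sketched in userspace roughly as follows. This
assumes the alias is a preprocessor macro that selects the underlying
struct. The demo_ names and the DEMO_VFS_USES_TIMESPEC64 switch are
hypothetical and only stand in for the real vfs_time definition.

```c
#include <stdint.h>

/* Hypothetical stand-ins for timespec and timespec64. */
struct demo_timespec   { long    tv_sec; long tv_nsec; };
struct demo_timespec64 { int64_t tv_sec; long tv_nsec; };

/* Step 2: build once without and once with this macro defined. */
#ifdef DEMO_VFS_USES_TIMESPEC64
#define demo_vfs_time demo_timespec64
#else
#define demo_vfs_time demo_timespec
#endif

struct demo_inode {
	struct demo_vfs_time i_mtime;
};

/* Step 1: filesystem code names only the alias, never the real type. */
static void demo_fs_set_mtime(struct demo_inode *inode,
			      const struct demo_vfs_time *now)
{
	inode->i_mtime = *now;
}
```

If the filesystem code builds without new diagnostics in both
configurations, every timestamp it touches goes through the alias.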

Tagging tricks are also useful here.

Keep in mind that timespec is already the same as timespec64 on 64-bit
systems. There might still be some tricky cases, but those should be okay
because each conversion will have to be reviewed by the individual
filesystem maintainers.

> Any change is going to take a significant amount of testing and
> verification, and that's something we don't have yet. Nobody has
> written any tests for xfstests to verify correct 64 bit timestamp
> behaviour, nor do we have tests to verify 32 bit timestamp behaviour
> on a 64 bit time kernel. These are things that we are going to need;
> all filesystems should behave the same w.r.t. these configurations,
> so we really do need regression tests for this....

Agreed. This is needed regardless of which approach is chosen.
And this is a problem for all filesystems even today.

-Deepa