Re: [PATCH 04/12] fs: ceph: CURRENT_TIME with ktime_get_real_ts()
From: Deepa Dinamani
Date: Thu Jun 01 2017 - 20:57:20 EST
On Thu, Jun 1, 2017 at 5:36 PM, John Stultz <john.stultz@xxxxxxxxxx> wrote:
> On Thu, Jun 1, 2017 at 5:26 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>> On Thu, Jun 1, 2017 at 6:22 PM, Arnd Bergmann <arnd@xxxxxxxx> wrote:
>>> On Thu, Jun 1, 2017 at 11:56 AM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>>>> On Sat, Apr 8, 2017 at 8:57 AM, Deepa Dinamani <deepa.kernel@xxxxxxxxx> wrote:
>>>
>>>>> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
>>>>> index 517838b..77204da 100644
>>>>> --- a/drivers/block/rbd.c
>>>>> +++ b/drivers/block/rbd.c
>>>>> @@ -1922,7 +1922,7 @@ static void rbd_osd_req_format_write(struct rbd_obj_request *obj_request)
>>>>>  {
>>>>>  	struct ceph_osd_request *osd_req = obj_request->osd_req;
>>>>>
>>>>> -	osd_req->r_mtime = CURRENT_TIME;
>>>>> +	ktime_get_real_ts(&osd_req->r_mtime);
>>>>>  	osd_req->r_data_offset = obj_request->offset;
>>>>>  }
>>>>>
>>>>> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
>>>>> index c681762..1d3fa90 100644
>>>>> --- a/fs/ceph/mds_client.c
>>>>> +++ b/fs/ceph/mds_client.c
>>>>> @@ -1666,6 +1666,7 @@ struct ceph_mds_request *
>>>>>  ceph_mdsc_create_request(struct ceph_mds_client *mdsc, int op, int mode)
>>>>>  {
>>>>>  	struct ceph_mds_request *req = kzalloc(sizeof(*req), GFP_NOFS);
>>>>> +	struct timespec ts;
>>>>>
>>>>>  	if (!req)
>>>>>  		return ERR_PTR(-ENOMEM);
>>>>> @@ -1684,7 +1685,8 @@ ceph_mdsc_create_request(struct ceph_mds_client *mdsc, int op, int mode)
>>>>>  	init_completion(&req->r_safe_completion);
>>>>>  	INIT_LIST_HEAD(&req->r_unsafe_item);
>>>>>
>>>>> -	req->r_stamp = current_fs_time(mdsc->fsc->sb);
>>>>> +	ktime_get_real_ts(&ts);
>>>>> +	req->r_stamp = timespec_trunc(ts, mdsc->fsc->sb->s_time_gran);
>>>>
>>>> This change causes our kernel_untar_tar test case to fail (the inode's
>>>> ctime goes backwards). The reason is that there is time drift between
>>>> the time stamps obtained from ktime_get_real_ts() and current_time().
>>>> We need to revert this change until current_time() uses
>>>> ktime_get_real_ts() internally.
>>>
>>> Hmm, the change was not supposed to have a user-visible effect, so
>>> something has gone wrong, but I don't immediately see how it
>>> relates to what you observe.
>>>
>>> ktime_get_real_ts() and current_time() use the same time base, so there
>>> is no drift, but there is a difference in resolution: the latter uses
>>> the time stamp of the last jiffies update, which may be up to one jiffy
>>> (10ms) behind the exact time we put in the request stamps here.
>>>
>>> Do you still see problems if you use current_kernel_time() instead of
>>> ktime_get_real_ts()?
>>
>> The problem disappears after using current_kernel_time().
>>
>> https://github.com/ceph/ceph-client/commit/2e0f648da23167034a3cf1500bc90ec60aef2417
>
> From the commit above:
> "It seems there is time drift between ktime_get_real_ts() and
> current_kernel_time()"
>
> It's more of a granularity difference. current_kernel_time() returns
> the cached time at the last tick, whereas ktime_get_real_ts() reads
> the clocksource hardware and returns the immediate time.
>
> Filesystems usually use the cached time (similar to
> CLOCK_REALTIME_COARSE), for performance reasons, as touching the
> clocksource takes time.
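
To illustrate the granularity point from userspace: the sketch below is
not kernel code. It assumes CLOCK_REALTIME_COARSE is a fair stand-in for
the tick-cached time returned by current_kernel_time(), and CLOCK_REALTIME
for the clocksource read done by ktime_get_real_ts(). The helper names
(ts_cmp, read_clock, later_coarse_stamp_is_older) are made up for this
example.

```c
#define _GNU_SOURCE
#include <time.h>

/* Return <0, 0 or >0 if a is earlier than, equal to, or later than b. */
int ts_cmp(struct timespec a, struct timespec b)
{
	if (a.tv_sec != b.tv_sec)
		return a.tv_sec < b.tv_sec ? -1 : 1;
	if (a.tv_nsec != b.tv_nsec)
		return a.tv_nsec < b.tv_nsec ? -1 : 1;
	return 0;
}

/* Read the given clock; the result stays {0, 0} if clock_gettime() fails. */
struct timespec read_clock(clockid_t id)
{
	struct timespec ts = { 0, 0 };

	clock_gettime(id, &ts);
	return ts;
}

/*
 * Take a fine-grained stamp first (standing in for the request stamp) and
 * a coarse stamp afterwards (standing in for a later inode time).
 * Returns 1 if the later, coarse stamp is older than the earlier, fine one.
 */
int later_coarse_stamp_is_older(void)
{
	struct timespec fine = read_clock(CLOCK_REALTIME);
	struct timespec coarse = read_clock(CLOCK_REALTIME_COARSE);

	return ts_cmp(coarse, fine) < 0;
}
```

On a kernel with a periodic tick I would expect
later_coarse_stamp_is_older() to return 1 most of the time, since the
coarse value is whatever was cached at the last tick. That is the same
ordering inversion as a request stamp that is newer than an inode time
taken after it.
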
Alternatively, it would be best for this code to use current_time() as well.
I had suggested this in one of the previous versions of the patch.
The implementation of current_time() will change when we switch the VFS to
64-bit time, so using it here would prevent such errors from happening again.
But it also means more code reordering in these modules so that they
can get a reference to the inode.
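
For reference, here is a rough userspace model of what current_time()
does today, as far as I understand it: take the cached (coarse) time and
truncate it to the superblock's s_time_gran. model_timespec_trunc() and
model_current_time() are hypothetical names for this sketch, and
CLOCK_REALTIME_COARSE is an assumed stand-in for current_kernel_time().

```c
#define _GNU_SOURCE
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/*
 * Userspace model of the truncation rule: round tv_nsec down to a
 * multiple of gran (in nanoseconds). gran == 1 leaves the stamp as is,
 * gran == NSEC_PER_SEC clears the nanoseconds. Other out-of-range
 * values are ignored here.
 */
struct timespec model_timespec_trunc(struct timespec t, long gran)
{
	if (gran == NSEC_PER_SEC)
		t.tv_nsec = 0;
	else if (gran > 1 && gran < NSEC_PER_SEC)
		t.tv_nsec -= t.tv_nsec % gran;
	return t;
}

/*
 * Model of a single time helper: coarse (tick-cached) time, truncated to
 * the filesystem's granularity. s_time_gran is passed in directly because
 * this sketch has no inode or superblock to read it from.
 */
struct timespec model_current_time(long s_time_gran)
{
	struct timespec now = { 0, 0 };

	clock_gettime(CLOCK_REALTIME_COARSE, &now);
	return model_timespec_trunc(now, s_time_gran);
}
```

With an example granularity of 1000 ns, model_current_time(1000) returns
a stamp whose tv_nsec is a multiple of 1000. If the request stamp and the
inode times all came from one helper like this, they would share the same
clock and the same truncation.
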
-Deepa