Re: [PATCH v3 2/2] hv_utils: implement Hyper-V PTP source

From: Vitaly Kuznetsov
Date: Tue Jan 17 2017 - 12:27:48 EST


Stephen Hemminger <stephen@xxxxxxxxxxxxxxxxxx> writes:

> On Tue, 17 Jan 2017 16:27:19 +0100
> Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> wrote:
>
>> With TimeSync version 4 protocol support we started updating system time
>> continuously through the whole lifetime of Hyper-V guests. Every 5 seconds
>> there is a time sample from the host which triggers do_settimeofday[64]().
>> While the time from the host is very accurate, such adjustments may cause
>> issues:
>> - Time is jumping forward and backward, some applications may misbehave.
>> - In case an NTP server runs in parallel and uses something else for time
>> sync (network, PTP,...) system time will never converge.
>> - Systemd starts annoying you by printing "Time has been changed" every 5
>> seconds to the system log.
>>
>> Instead of doing in-kernel time adjustments, offload the work to an
>> NTP client by exposing TimeSync messages as a PTP device. Users may now
>> decide what they want to use as a source.
>>
>> I tested the solution with chrony, the config was:
>>
>> refclock PHC /dev/ptp0 poll 3 precision 1e-9
>>
>> The result I'm seeing is accurate enough: the time delta between the guest
>> and the host is almost always within [-10us, +10us]. The in-kernel solution
>> was giving us comparable results.
>>
>> I also tried implementing a PPS device instead of PTP by using the currently
>> unused Hyper-V synthetic timers (we use only one of four for clockevent), but
>> with a PPS source alone chrony wasn't able to give me the required accuracy;
>> the delta was often more than 100us.
>>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
>
> Looks good. Minor style comments.
>
>> ---
>> drivers/hv/hv_util.c | 140 ++++++++++++++++++++++++++++++++++++++++++---------
>> 1 file changed, 115 insertions(+), 25 deletions(-)
>>
>> diff --git a/drivers/hv/hv_util.c b/drivers/hv/hv_util.c
>> index 94719eb..e49c5f3 100644
>> --- a/drivers/hv/hv_util.c
>> +++ b/drivers/hv/hv_util.c
>
>> +static inline u64 get_timeadj_latency(u64 ref_time)
>
> inline not necessary on static functions. GCC inlines anyway
>

Even when we have multiple call sites? Interesting...

>> +{
>> + u64 current_tick;
>> +
>> + if (ts_srv_version <= TS_VERSION_3)
>> + return 0;
>> +
>> + /*
>> + * Some latency has been introduced since Hyper-V generated
>> + * its time sample. Take that latency into account before
>> + * using TSC reference time sample from Hyper-V.
>> + *
>> + * This sample is given by TimeSync v4 and above hosts.
>> + */
>> +
>> + rdmsrl(HV_X64_MSR_TIME_REF_COUNT, current_tick);
>
> Personal preference is not to add blank line between comment
> and associated code.
>
> ...
>

Ok.
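
For anyone following the thread without the full patch at hand, the latency
correction described in the comment above boils down to simple arithmetic on
the host's reference counter, which ticks in 100ns units. Below is a minimal
userspace sketch of that idea. The function names and the boolean parameter
are made up for illustration (the patch checks ts_srv_version instead), and
the subtraction is my assumption about the elided part of the function, not a
copy of it:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for get_timeadj_latency(): how many 100ns ticks
 * have passed since the host took its time sample.
 *
 * host_sends_ref_time - assumed flag; true for TimeSync v4 and above
 * ref_time            - reference counter value sent with the sample
 * current_tick        - reference counter value read by the guest now
 */
static uint64_t timeadj_latency(bool host_sends_ref_time,
				uint64_t ref_time, uint64_t current_tick)
{
	/* Older hosts do not provide a reference time: nothing to correct. */
	if (!host_sends_ref_time)
		return 0;

	return current_tick - ref_time;
}

/* Host time sample (100ns units) moved forward by the measured latency. */
static uint64_t adjusted_host_time(uint64_t host_time, uint64_t latency)
{
	return host_time + latency;
}
```

In the driver the current tick would come from
rdmsrl(HV_X64_MSR_TIME_REF_COUNT, ...) rather than being passed in.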

>> +
>> +struct ptp_clock_info ptp_hyperv_info = {
>
> This could be static?
> Could it be const?
>

Could be both I think.

>> + .name = "hyperv",
>> + .enable = hv_ptp_enable,
>> + .adjtime = hv_ptp_adjtime,
>> + .adjfreq = hv_ptp_adjfreq,
>> + .gettime64 = hv_ptp_gettime,
>> + .settime64 = hv_ptp_settime,
>> + .owner = THIS_MODULE,
>> +};
>> +
>> +static struct ptp_clock *hv_ptp_clock;
>> +
>> static int hv_timesync_init(struct hv_util_service *srv)
>> {
>> INIT_WORK(&wrk.work, hv_set_host_time);
>> +
>> + hv_ptp_clock = ptp_clock_register(&ptp_hyperv_info, NULL);
>> + if (IS_ERR_OR_NULL(hv_ptp_clock)) {
>> + pr_err("cannot register PTP clock: %ld\n",
>> + PTR_ERR(hv_ptp_clock));
>
> Why not return error to init routine in case of failure.
>
>> + hv_ptp_clock = NULL;
>
> Why not return error to init routine? Rather than having user
> scan log.
>

The idea here was to not depend on CONFIG_PTP_1588_CLOCK. If
CONFIG_PTP_1588_CLOCK is disabled, ptp_clock_register() returns NULL,
but the Hyper-V timesync driver remains functional: it still handles
the ICTIMESYNCFLAG_SYNC case, only the PTP device will be missing.
We can:
1) Put PTP-related code under #ifdef CONFIG_PTP_1588_CLOCK.
2) Handle errors and NULL returned from ptp_clock_register() differently:
fail init if we get an error and continue if we see NULL.
3) Leave things as they are.
4) Always require CONFIG_PTP_1588_CLOCK.

My personal preference would be 2 or 3. What do you think?
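
To make option 2 concrete, here is a rough userspace sketch of the check I
have in mind. ERR_PTR/PTR_ERR/IS_ERR below are local stand-ins for the
<linux/err.h> helpers so the snippet compiles outside the kernel, and
hv_ptp_check_register() is a hypothetical helper name, not something in the
patch:

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Userspace stand-ins for the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO 4095

static void *ERR_PTR(long error)
{
	return (void *)error;
}

static long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct ptp_clock;	/* opaque here, as it is for the driver */

/*
 * Hypothetical helper: decide what init should return for a given
 * ptp_clock_register() result.
 *
 *   error pointer -> log it, reset the pointer and fail init
 *   NULL          -> PTP support is compiled out, carry on without it
 *   valid pointer -> success
 */
static int hv_ptp_check_register(struct ptp_clock **clock)
{
	if (IS_ERR(*clock)) {
		long err = PTR_ERR(*clock);

		fprintf(stderr, "cannot register PTP clock: %ld\n", err);
		*clock = NULL;
		return (int)err;
	}

	/* NULL or a valid clock: the timesync driver stays functional. */
	return 0;
}
```

hv_timesync_init() would then return whatever this check returns instead of
always returning 0.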

>> + }
>> +
>> return 0;
>> }

--
Vitaly