Re: [patch 5/5] PTP: add kvm PTP driver

From: Radim Krcmar
Date: Fri Jan 20 2017 - 09:13:04 EST


2017-01-20 10:20-0200, Marcelo Tosatti:
> Add a driver with gettime method returning host's realtime clock.
> This allows Chrony to synchronize host and guest clocks with
> high precision (see results below).
>
> chronyc> sources
> MS Name/IP address         Stratum Poll Reach LastRx Last sample
> ===============================================================================
> #* PHC0                          0   3   377     4   +162ns[ -683ns] +/-   11ns
>
> To configure Chronyd to use PHC refclock, add the
> following line to its configuration file:
>
> refclock PHC /dev/ptpX poll 3 dpoll -2 offset 0
>
> Where /dev/ptpX is the kvmclock PTP clock.
>
> Signed-off-by: Marcelo Tosatti <mtosatti@xxxxxxxxxx>
>
> ---
> drivers/ptp/Kconfig | 12 ++
> drivers/ptp/Makefile | 1
> drivers/ptp/ptp_kvm.c | 213 ++++++++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 226 insertions(+)
>
> v2: check for kvmclock (Radim)
> initialize global variables before device registration (Radim)
> v3: use cross timestamps callback (Paolo, Miroslav, Radim)
>
> Index: kvm-ptpdriver/drivers/ptp/ptp_kvm.c
> ===================================================================
> --- /dev/null 1970-01-01 00:00:00.000000000 +0000
> +++ kvm-ptpdriver/drivers/ptp/ptp_kvm.c 2017-01-20 10:19:20.555311672 -0200
> @@ -0,0 +1,213 @@
> +/*
> + * Virtual PTP 1588 clock for use with KVM guests
> + *
> + * Copyright (C) 2017 Red Hat Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + */
> +#include <linux/device.h>
> +#include <linux/err.h>
> +#include <linux/init.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <uapi/linux/kvm_para.h>
> +#include <asm/kvm_para.h>
> +#include <asm/pvclock.h>
> +#include <asm/kvmclock.h>
> +#include <uapi/asm/kvm_para.h>
> +
> +#include <linux/ptp_clock_kernel.h>
> +
> +struct kvm_ptp_clock {
> + struct ptp_clock *ptp_clock;
> + struct ptp_clock_info caps;
> +};
> +
> +DEFINE_SPINLOCK(kvm_ptp_lock);
> +
> +static struct pvclock_vsyscall_time_info *hv_clock;
> +
> +static struct kvm_clock_offset clock_off;
> +static phys_addr_t clock_off_gpa;
> +
> +/*
> + * system_counterval.cycles: kvmclock value computed from the host TSC.
> + * system_counterval.cs: kvmclock clocksource.
> + * device_time: host realtime clock.
> + *
> + */
> +static int ptp_kvm_get_time_fn(ktime_t *device_time,
> + struct system_counterval_t *system_counter,
> + void *ctx)
> +{
> + unsigned long ret;
> + struct timespec64 tspec;
> + unsigned version;
> + u8 flags;
> + int cpu;
> + struct pvclock_vcpu_time_info *src;
> +
> + preempt_disable_notrace();
> + cpu = smp_processor_id();
> + src = &hv_clock[cpu].pvti;
> +
> + spin_lock(&kvm_ptp_lock);

What does the lock prevent?
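
For reference, the pvti read in the loop quoted below is already guarded
by the version check. A user-space sketch of that protocol follows; the
struct and function names are made up for illustration, and C11 atomics
stand in for the kernel's virt_rmb(). It only covers the per-cpu pvti
data, not the global clock_off buffer.

```c
#include <stdint.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the versioned part of pvclock_vcpu_time_info. */
struct pvti_ver_sketch {
	_Atomic uint32_t version;	/* odd while the host is mid-update */
	uint64_t system_time;		/* payload guarded by version */
};

/* Snapshot the version with the low bit cleared, before reading the data. */
static uint32_t read_begin(struct pvti_ver_sketch *p)
{
	return atomic_load_explicit(&p->version, memory_order_acquire) & ~1u;
}

/*
 * Nonzero if the data may have changed since read_begin(): either the
 * version moved on, or it was odd (update in progress) to begin with.
 */
static int read_retry(struct pvti_ver_sketch *p, uint32_t version)
{
	atomic_thread_fence(memory_order_acquire);
	return version != atomic_load_explicit(&p->version, memory_order_relaxed);
}
```

A reader wraps its accesses in do { v = read_begin(p); ... } while
(read_retry(p, v)); which is the same shape as the loop in the patch.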

> +
> + do {
> + /*
> + * We are measuring the delay between
> + * kvm_hypercall and rdtsc using TSC,
> + * and converting that delta using
> + * tsc_to_system_mul and tsc_shift.
> + * So any changes to tsc_to_system_mul
> + * and tsc_shift in this region
> + * invalidate the measurement.
> + */
> + version = pvclock_read_begin(src);
> +
> + ret = kvm_hypercall2(KVM_HC_CLOCK_PAIRING,
> + clock_off_gpa,
> + KVM_CLOCK_PAIRING_WALLCLOCK);
> + if (ret != 0) {
> + pr_err("clock offset hypercall ret %lu\n", ret);
> + spin_unlock(&kvm_ptp_lock);
> + preempt_enable_notrace();
> + return -EOPNOTSUPP;
> + }
> +
> + tspec.tv_sec = clock_off.sec;
> + tspec.tv_nsec = clock_off.nsec;
> + ret = __pvclock_read_cycles(src, clock_off.tsc);
> + flags = src->flags;
> + } while (pvclock_read_retry(src, version));
> +
> + preempt_enable_notrace();
> +
> + system_counter->cycles = ret;
> + system_counter->cs = get_kvmclock_cs();

Can't we use clocksource_tsc and just pass the tsc without kvmclock in
the middle?
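
To make the question concrete, here is a user-space sketch of the
conversion that, as far as I recall, __pvclock_read_cycles() applies to
the host TSC before the result lands in ->cycles. This is not the kernel
code: the struct and function names are invented for illustration, and
it relies on the GCC/Clang unsigned __int128 extension. Passing the tsc
directly would mean handing over clock_off.tsc and skipping this step.

```c
#include <stdint.h>

/* Hypothetical user-space mirror of the pvclock fields used for scaling. */
struct pvti_sketch {
	uint64_t tsc_timestamp;		/* host TSC when system_time was sampled */
	uint64_t system_time;		/* kvmclock nanoseconds at tsc_timestamp */
	uint32_t tsc_to_system_mul;	/* 32.32 fixed-point multiplier */
	int8_t tsc_shift;		/* pre-shift applied to the TSC delta */
};

/* Scale a TSC delta to nanoseconds: shift first, then multiply by mul / 2^32. */
static uint64_t scale_delta(uint64_t delta, uint32_t mul, int8_t shift)
{
	if (shift < 0)
		delta >>= -shift;
	else
		delta <<= shift;
	return (uint64_t)(((unsigned __int128)delta * mul) >> 32);
}

/* The kvmclock reading that corresponds to a given host TSC value. */
static uint64_t kvmclock_from_tsc(const struct pvti_sketch *p, uint64_t tsc)
{
	return p->system_time +
	       scale_delta(tsc - p->tsc_timestamp,
			   p->tsc_to_system_mul, p->tsc_shift);
}
```

With tsc_timestamp = 1000, system_time = 5000, mul = 0x80000000 and
shift = 1, a tsc of 1100 maps to 5100: the delta of 100 is doubled by
the shift and halved by the multiplier.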

> + tspec.tv_nsec = tspec.tv_nsec;

(This looks extraneous.)

Thanks.