Re: [patch 2/3] KVM: x86: add KVM_HC_CLOCK_OFFSET hypercall

From: Marcelo Tosatti
Date: Fri Jan 13 2017 - 11:00:09 EST


On Fri, Jan 13, 2017 at 04:31:58PM +0100, Radim Krcmar wrote:
> 2017-01-13 10:01-0200, Marcelo Tosatti:
> > Add a hypercall to retrieve the host realtime clock
> > and the TSC value used to calculate that clock read.
> >
> > Used to implement clock synchronization between
> > host and guest.
> >
> > Signed-off-by: Marcelo Tosatti <mtosatti@xxxxxxxxxx>
> >
> > ---
> > Documentation/virtual/kvm/hypercalls.txt | 30 ++++++++++++++++++++++
> > arch/x86/include/uapi/asm/kvm_para.h | 9 ++++++
> > arch/x86/kvm/x86.c | 41 +++++++++++++++++++++++++++++++
> > include/uapi/linux/kvm_para.h | 3 ++
> > 4 files changed, 83 insertions(+)
> >
> > Index: kvm-ptpdriver/arch/x86/include/uapi/asm/kvm_para.h
> > ===================================================================
> > --- kvm-ptpdriver.orig/arch/x86/include/uapi/asm/kvm_para.h 2017-01-13 09:03:33.823305270 -0200
> > +++ kvm-ptpdriver/arch/x86/include/uapi/asm/kvm_para.h 2017-01-13 09:07:54.633699270 -0200
> > @@ -50,6 +50,15 @@
> > __u32 pad[11];
> > };
> >
> > +#define KVM_CLOCK_OFFSET_WALLCLOCK 0
> > +struct kvm_clock_offset {
> > + __s64 sec;
> > + __s64 nsec;
> > + __u64 tsc;
> > + __u32 flags;
> > + __u32 pad[1];
> > +};
> > +
> > #define KVM_STEAL_ALIGNMENT_BITS 5
> > #define KVM_STEAL_VALID_BITS ((-1ULL << (KVM_STEAL_ALIGNMENT_BITS + 1)))
> > #define KVM_STEAL_RESERVED_MASK (((1 << KVM_STEAL_ALIGNMENT_BITS) - 1 ) << 1)
> > Index: kvm-ptpdriver/Documentation/virtual/kvm/hypercalls.txt
> > ===================================================================
> > --- kvm-ptpdriver.orig/Documentation/virtual/kvm/hypercalls.txt 2017-01-13 09:03:33.823305270 -0200
> > +++ kvm-ptpdriver/Documentation/virtual/kvm/hypercalls.txt 2017-01-13 09:15:32.971389949 -0200
> > @@ -81,3 +81,33 @@
> > same guest can wakeup the sleeping vcpu by issuing KVM_HC_KICK_CPU hypercall,
> > specifying APIC ID (a1) of the vcpu to be woken up. An additional argument (a0)
> > is used in the hypercall for future use.
> > +
> > +
> > +6. KVM_HC_CLOCK_OFFSET
> > +------------------------
> > +Architecture: x86
> > +Status: active
> > +Purpose: Hypercall used to synchronize host and guest clocks.
> > +Usage:
> > +
> > +a0: guest physical address where host copies
> > +"struct kvm_clock_offset" structure.
> > +
> > +a1: clock_type, ATM only KVM_CLOCK_OFFSET_WALLCLOCK (0)
> > +is supported (hosts CLOCK_REALTIME clock).
> > +
> > + struct kvm_clock_offset {
> > + __s64 sec;
> > + __s64 nsec;
>
> Why is nsec:
> 1) signed -- it is a remainder after division by NSEC_PER_SEC

Because "struct timespec" is signed because it can be used for
time deltas (you won't actually get signed values for
kvm_get_walltime_and_clockread).

Just wanted to match "struct timespec".

> 2) bigger than 32 bit -- NSEC_PER_SEC < 2^32
> ?

Again matching struct timespec.

> > + __u64 tsc;
> > + __u32 flags;
> > + __u32 pad;
> > + };
> > +
> > + Where:
> > + * sec: seconds from clock_type clock.
> > + * nsec: nanoseconds from clock_type clock.
>
> The important part of an offset is the starting point -- I assume it is
> the the usual one, but documentation better be explicit.

Don't get what you mean? (the values have same meaning as hosts
clock_gettime(CLOCK_REALTIME), supposedly that is clear).

> > + * tsc: TSC value used to calculate sec/nsec pair
>
> This hypercall could in theory work even when computing the offset with
> a different clocksource on the host and the estimating TSC at that time.

Yes it could.

> > + (this hypercall only works when host uses TSC clocksource).
>
> Describe that error as a return value.

Fixed.

+
+ if (kvm_get_walltime_and_clockread(&ts, &cycle) == false) {
+ kfree(clock_offset);
+ return -KVM_EOPNOTSUPP;
+ }

>
> > + * flags: flags, unused at the moment.
>
> unused = 0.

Fixed.

> > +
> > Index: kvm-ptpdriver/arch/x86/kvm/x86.c
> > ===================================================================
> > --- kvm-ptpdriver.orig/arch/x86/kvm/x86.c 2017-01-13 09:04:46.581415259 -0200
> > +++ kvm-ptpdriver/arch/x86/kvm/x86.c 2017-01-13 09:08:51.557785166 -0200
> > @@ -6121,6 +6121,44 @@
> > }
> > EXPORT_SYMBOL_GPL(kvm_emulate_halt);
> >
> > +static int kvm_pv_clock_offset(struct kvm_vcpu *vcpu, gpa_t paddr,
> > + unsigned long clock_type)
> > +{
> > + struct kvm_clock_offset *clock_offset;
> > + struct timespec ts;
> > + cycle_t cycle;
> > + int ret;
> > +
> > + if (clock_type != KVM_CLOCK_OFFSET_WALLCLOCK)
> > + return -KVM_EOPNOTSUPP;
> > +
> > + if (pvclock_gtod_data.clock.vclock_mode != VCLOCK_TSC)
> > + return -KVM_EOPNOTSUPP;
> > +
> > + clock_offset = kzalloc(sizeof(struct kvm_clock_offset), GFP_KERNEL);
>
> Can't clock_offset be on the stack?

Well since the kernel stack is limited, thought it would be safer to
allocate it.

> > + if (clock_offset == NULL)
> > + return -KVM_ENOMEM;
> > +
> > + if (kvm_get_walltime_and_clockread(&ts, &cycle) == false) {
> > + kfree(clock_offset);
> > + return -KVM_EOPNOTSUPP;
> > + }
> > +
> > + clock_offset->sec = ts.tv_sec;
> > + clock_offset->nsec = ts.tv_nsec;
> > + clock_offset->tsc = kvm_read_l1_tsc(vcpu, cycle);
> > + clock_offset->flags = 0;
> > +
> > + ret = 0;
> > + if (kvm_write_guest(vcpu->kvm, paddr, clock_offset,
> > + sizeof(struct kvm_clock_offset)))
> > + ret = -KVM_EFAULT;
> > +
> > + kfree(clock_offset);
> > +
> > + return ret;
> > +}
> > +
> > /*
> > * kvm_pv_kick_cpu_op: Kick a vcpu.
> > *
> > @@ -6185,6 +6223,9 @@
> > kvm_pv_kick_cpu_op(vcpu->kvm, a0, a1);
> > ret = 0;
> > break;
> > + case KVM_HC_CLOCK_OFFSET:
> > + ret = kvm_pv_clock_offset(vcpu, a0, a1);
> > + break;
> > default:
> > ret = -KVM_ENOSYS;
> > break;
> > Index: kvm-ptpdriver/include/uapi/linux/kvm_para.h
> > ===================================================================
> > --- kvm-ptpdriver.orig/include/uapi/linux/kvm_para.h 2017-01-13 09:03:33.823305270 -0200
> > +++ kvm-ptpdriver/include/uapi/linux/kvm_para.h 2017-01-13 09:07:54.636699274 -0200
> > @@ -14,6 +14,8 @@
> > #define KVM_EFAULT EFAULT
> > #define KVM_E2BIG E2BIG
> > #define KVM_EPERM EPERM
> > +#define KVM_EOPNOTSUPP EOPNOTSUPP
> > +#define KVM_ENOMEM ENOMEM
> >
> > #define KVM_HC_VAPIC_POLL_IRQ 1
> > #define KVM_HC_MMU_OP 2
> > @@ -23,6 +25,7 @@
> > #define KVM_HC_MIPS_GET_CLOCK_FREQ 6
> > #define KVM_HC_MIPS_EXIT_VM 7
> > #define KVM_HC_MIPS_CONSOLE_OUTPUT 8
> > +#define KVM_HC_CLOCK_OFFSET 9
> >
> > /*
> > * hypercalls use architecture specific
> >
> >