Re: [PATCH 4/4] KVM: x86: track actual TSC frequency from the timekeeper struct

From: Marcelo Tosatti
Date: Tue Feb 16 2016 - 09:26:20 EST


On Tue, Feb 16, 2016 at 02:48:16PM +0100, Marcelo Tosatti wrote:
> On Mon, Feb 08, 2016 at 04:18:31PM +0100, Paolo Bonzini wrote:
> > When an NTP server is running, it may adjust the time substantially
> > compared to the "official" frequency of the TSC. A 12 ppm change
> > sums up to one second per day.
> >
> > This already shows up if the guest compares kvmclock with e.g. the
> > PM timer. It shows up even more once we add support for the Hyper-V
> > TSC page, because the guest expects it to be in sync with the time
> > reference counter; effectively the time reference counter is just a
> > slow path to access the same clock that is in the TSC page.
> >
> > Therefore, we want kvmclock to provide the host kernel's
> > ktime_get_boot_ns() value, at least if the master clock is active.
> > To do so, reverse-compute the host's "actual" TSC frequency from
> > pvclock_gtod_data and return it from kvm_get_time_and_clockread.
>
> Paolo,
>
> You'd have to generate an update to the guest structures as well,
> to reflect the new {mult,shift} values calculated by the host.
> Here:
>
> /* disable master clock if host does not trust, or does not
> * use, TSC clocksource
> */
> if (gtod->clock.vclock_mode != VCLOCK_TSC &&
> atomic_read(&kvm_guest_has_master_clock) != 0)
> queue_work(system_long_wq, &pvclock_gtod_work);
>
> No?
>
> At first, i'm afraid this might be heavy, so it might be interesting
> to rate limit the update operation.
>

Paolo,

I suppose its not sufficient:

500ppm of 300 seconds = .0005*300 = 0.15 seconds.

Should aim at avoiding time backwards event in the following situation:


T1) t1_kvmclock_read = get_nanoseconds();
/* NTP correction to kernel clock = 500ppm */
/* TSC correction via mult,shift = 0ppm */

VM-exit, update kvmclock (or Hyper-V) clock data with
new values

T2) t2_kvmclock_read = get_nanoseconds();
/* NTP correction to kernel clock = 500ppm */
/* TSC correction via mult,shift = 500ppm */


So should not allow the host clock (or system_timestamp) to diverge
from (TSC based calculation) more than the duration of the event:

VM-exit, update kvmclock (or Hyper-V) with new data.

To avoid t2_kvmclock_read < t1_kvmclock_read