Re: [PATCH] KVM: x86: fix kvmclock breakage from timers branch merge

From: Thomas Gleixner
Date: Thu Sep 04 2014 - 16:58:48 EST


On Thu, 4 Sep 2014, Paolo Bonzini wrote:

> Commit cbcf2dd3b3d4 (x86: kvm: Make kvm_get_time_and_clockread() nanoseconds
> based, 2014-07-16) forgot to add tk->xtime_sec, thus breaking kvmclock on

Errm. How is boottime related to xtime_sec?

> hosts that have a reliable TSC. Add it back; and since the field boot_ns
> is not anymore related to the host boot-based clock, rename boot_ns->nsec_base
> and the existing nsec_base->snsec_base.

This is simply wrong.

The original code before that changed did:

vdata->monotonic_time_sec = tk->xtime_sec
+ tk->wall_to_monotonic.tv_sec;
vdata->monotonic_time_snsec = tk->xtime_nsec
+ (tk->wall_to_monotonic.tv_nsec
<< tk->shift);
So this is the momentary monotonic base time

And the readout function did:

ts->tv_nsec = 0;
do {
seq = read_seqcount_begin(&gtod->seq);
mode = gtod->clock.vclock_mode;
ts->tv_sec = gtod->monotonic_time_sec;
ns = gtod->monotonic_time_snsec;
ns += vgettsc(cycle_now);
ns >>= gtod->clock.shift;
} while (unlikely(read_seqcount_retry(&gtod->seq, seq)));
timespec_add_ns(ts, ns);

So this does:

now = monotonic_base + delta_nsec

And the caller converted it to boot time with:

monotonic_to_bootbased(&ts);

So the time calculation does:

now = monotonic_base + delta_nsec + mono_to_boot

Because: monotonic_base + mono_to_boot = boot_time_base

The calculation can be written as:

now = boot_time_base + delta_nsec


The new code does

boot_ns = ktime_to_ns(ktime_add(tk->base_mono, tk->offs_boot));

So thats

boot_time_base = monotonic_base + mono_to_boot;

vdata->boot_ns = boot_ns;
vdata->nsec_base = tk->tkr.xtime_nsec;

And the readout does:

do {
seq = read_seqcount_begin(&gtod->seq);
mode = gtod->clock.vclock_mode;
ns = gtod->nsec_base;
ns += vgettsc(cycle_now);
ns >>= gtod->clock.shift;
ns += gtod->boot_ns;
} while (unlikely(read_seqcount_retry(&gtod->seq, seq)));
*t = ns;

Which is:

boot_time_base + delta_nsec

Now I have no idea why you think it needs to add xtime_sec. If the
result is wrong, then we need to figure out which one of the supplied
values is wrong and not blindly add xtime_sec just because that makes
it magically correct.

Can you please provide a proper background why you think that adding
xtime_sec is a good idea?

Thanks,

tglx


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/