Re: [PATCH v2 1/3] KVM: x86: implement KVM_{GET|SET}_TSC_STATE
From: Andy Lutomirski
Date: Tue Dec 08 2020 - 12:44:47 EST

On Tue, Dec 8, 2020 at 6:23 AM Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:
>
> On Mon, Dec 07, 2020 at 10:04:45AM -0800, Andy Lutomirski wrote:
> >
> >
> > I do have a feature request, though: IMO it would be quite nifty if the new kvmclock structure could also expose NTP corrections. In other words, if you could expose enough info to calculate CLOCK_MONOTONIC_RAW, CLOCK_MONOTONIC, and CLOCK_REALTIME, then we could have paravirt NTP.
>
> Hi Andy,
>
> Any reason why drivers/ptp/ptp_kvm.c does not work for you?
>
It looks like it tries to accomplish the right goal, but in a rather
roundabout way. The host knows how to convert from TSC to
CLOCK_REALTIME, and ptp_kvm.c exposes this data to the guest. But,
rather than just making the guest use the same CLOCK_REALTIME data as
the host, ptp_kvm.c seems to expose information to usermode that a
user daemon could use to try (with some degree of error?) to make the
guest kernel track CLOCK_REALTIME. This seems inefficient and
dubiously accurate.

My feature request is for this to be fully automatic and completely
coherent. I would like for a host user program and a guest user
program to be able to share memory, run concurrently, and use the
shared memory to exchange CLOCK_REALTIME values without ever observing
the clock going backwards. This ought to be doable. Ideally the
result should even be usable for Spanner-style synchronization
assuming the host clock is good enough. Also, this whole thing should
work without needing to periodically wake the guest to remain
synchronized. If the guest sleeps for two minutes (full nohz-idle, no
guest activity at all), the host makes a small REALTIME frequency
adjustment, and then the guest runs user code that reads
CLOCK_REALTIME, the guest clock should still be fully synchronized
with the host. I don't think that ptp_kvm.c-style synchronization can
do this.
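To make the no-backwards requirement concrete, here is a minimal
userspace sketch of the check each side could run. It assumes the host
and guest programs already share one 64-bit slot (how the memory gets
shared is out of scope) and that 64-bit atomics are lock-free on both
sides. The names realtime_ns() and check_and_publish() are made up for
illustration:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Read CLOCK_REALTIME as nanoseconds since the epoch. */
static uint64_t realtime_ns(void)
{
	struct timespec ts;
	int rc = clock_gettime(CLOCK_REALTIME, &ts);

	assert(rc == 0);
	(void)rc;
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical helper, not an existing API.
 *
 * Load the newest timestamp the peer has published, THEN read the local
 * clock. If the local reading is older than what the peer published,
 * the clock went backwards from this observer's point of view, so
 * return false. Otherwise publish max(slot, now) and return true.
 */
static bool check_and_publish(_Atomic uint64_t *slot, uint64_t (*read_ns)(void))
{
	uint64_t seen = atomic_load_explicit(slot, memory_order_acquire);
	uint64_t now = read_ns();	/* must come after the load above */

	if (now < seen)
		return false;

	/* Only ever move the slot forward; on failure 'seen' is reloaded. */
	while (seen < now &&
	       !atomic_compare_exchange_weak_explicit(slot, &seen, now,
						      memory_order_acq_rel,
						      memory_order_acquire))
		;
	return true;
}
```

Both programs would call check_and_publish(slot, realtime_ns) in a
loop. With fully coherent clocks it should never return false, no
matter how the two sides interleave.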

tglx etc, I think that doing this really really nicely might involve
promoting something like the current vDSO data structures to ABI -- a
straightforward-ish implementation would be for the KVM host to export
its vvar clock data to the guest and for the guest to use it, possibly
with an offset applied. The offset could work a lot like timens works
today.
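As a rough illustration of the guest-side read path, here is a sketch
assuming a seqcount-protected parameter block similar in spirit to the
vDSO data. struct pv_clock_data, its fields, and pv_clock_read_ns() are
hypothetical; this is not the existing vvar layout. The per-guest
offset is applied at the end, the way a timens offset would be:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical host-exported clock parameters; NOT the real vvar layout. */
struct pv_clock_data {
	_Atomic uint32_t seq;		/* odd while the host is updating */
	_Atomic uint64_t cycle_last;	/* TSC value at the last host update */
	_Atomic uint64_t base_ns;	/* CLOCK_REALTIME (ns) at cycle_last */
	_Atomic uint32_t mult;		/* ns = (cycles * mult) >> shift */
	_Atomic uint32_t shift;
};

/*
 * Convert a TSC reading to nanoseconds using the host's parameters.
 * The TSC value is passed in to keep the sketch portable; a real
 * implementation would read the counter inside the retry loop.
 */
static uint64_t pv_clock_read_ns(struct pv_clock_data *d, uint64_t tsc,
				 int64_t offset_ns)
{
	uint32_t seq, mult, shift;
	uint64_t last, base, delta;

	do {
		/* Wait for an even sequence number (no update in progress). */
		while ((seq = atomic_load_explicit(&d->seq,
						   memory_order_acquire)) & 1)
			;
		last = atomic_load_explicit(&d->cycle_last, memory_order_relaxed);
		base = atomic_load_explicit(&d->base_ns, memory_order_relaxed);
		mult = atomic_load_explicit(&d->mult, memory_order_relaxed);
		shift = atomic_load_explicit(&d->shift, memory_order_relaxed);
		/* Order the data loads before re-checking the sequence. */
		atomic_thread_fence(memory_order_acquire);
	} while (atomic_load_explicit(&d->seq, memory_order_relaxed) != seq);

	assert(shift < 64);
	/* Clamp so a slightly stale TSC reading cannot go before base_ns. */
	delta = tsc > last ? tsc - last : 0;
	/* delta * mult can overflow for very long gaps; ignored here. */
	return base + ((delta * mult) >> shift) + (uint64_t)offset_ns;
}
```

The point of this shape is that the guest holds no clock state of its
own. If the host adjusts mult or base_ns while the guest is idle, the
next guest read simply picks up the new values without the guest
having been woken.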

--Andy