Re: [PATCH v2 1/3] KVM: x86: implement KVM_{GET|SET}_TSC_STATE

From: Maxim Levitsky
Date: Thu Dec 10 2020 - 09:54:57 EST


On Thu, 2020-12-10 at 12:48 +0100, Paolo Bonzini wrote:
> On 08/12/20 22:20, Thomas Gleixner wrote:
> > So now life migration comes a long time after timekeeping had set the
> > limits and just because it's virt it expects that everything works and it
> > just can ignore these limits.
> >
> > TBH. That's not any different than SMM or hard/firmware taking the
> > machine out for lunch. It's exactly the same: It's broken.
>
> I agree. If *live* migration stops the VM for 200 seconds, it's broken.
>
> Sure, there's the case of snapshotting the VM over the weekend. My
> favorite solution would be to just put it in S3 before doing that. *Do
> what bare metal does* and you can't go that wrong.

Note though that QEMU has a couple of issues with S3, and S3 is disabled
by default in libvirt.
I would be very happy to work on improving this if there is a need for it.


>
> In general it's userspace policy whether to keep the TSC value the same
> across live migration. There's pros and cons to both approaches, so KVM
> should provide the functionality to keep the TSC running (which the
> guest will see as a very long, but not extreme SMI), and this is what
> this series does. Maxim will change it to operate per-VM. Thanks
> Thomas, Oliver and everyone else for the input.

I agree with that.

I still think though that we should have a discussion on the feasibility
of making the kernel time code deal with large *forward* TSC jumps
without crashing.

If that is indeed hard to do, or will cause performance issues,
then I agree that we might indeed inform the guest of time jumps instead.

In fact kvmclock already has such a mechanism (the KVM_KVMCLOCK_CTRL ioctl,
which sets the PVCLOCK_GUEST_STOPPED bit in the PV clock struct).
That informs the guest that it was stopped (the guest clears the bit once it
has seen it), and currently that makes the guest touch various watchdogs.
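
To make the handshake concrete, here is a rough user-space model of it.
This is only a sketch and not the kernel code: the struct layout and the two
flag values are what I believe the pvclock ABI header defines, while
host_mark_guest_stopped(), guest_check_and_clear_paused() and
touch_watchdogs_model() are names made up for this example. As far as I know
the real ioctl is issued on a vCPU fd, and the flag reaches the guest on the
next kvmclock update rather than immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bits, assumed to match arch/x86/include/asm/pvclock-abi.h. */
#define PVCLOCK_TSC_STABLE_BIT (1 << 0)
#define PVCLOCK_GUEST_STOPPED  (1 << 1)

/* Per-vCPU time info shared between host and guest (assumed ABI layout). */
struct pvclock_vcpu_time_info {
	uint32_t version;
	uint32_t pad0;
	uint64_t tsc_timestamp;
	uint64_t system_time;
	uint32_t tsc_to_system_mul;
	int8_t   tsc_shift;
	uint8_t  flags;
	uint8_t  pad[2];
};

/* Hypothetical stand-in for the guest's watchdog handling: counts calls. */
static unsigned int watchdog_touches;

static void touch_watchdogs_model(void)
{
	watchdog_touches++;
}

/* Model of the host side: the net effect of KVM_KVMCLOCK_CTRL on the flags. */
static void host_mark_guest_stopped(struct pvclock_vcpu_time_info *pvti)
{
	assert(pvti != NULL);
	pvti->flags |= PVCLOCK_GUEST_STOPPED;
}

/*
 * Model of the guest side: if the host flagged a stop, clear the bit,
 * touch the watchdogs, and report that a pause was observed.
 */
static bool guest_check_and_clear_paused(struct pvclock_vcpu_time_info *pvti)
{
	assert(pvti != NULL);
	if (!(pvti->flags & PVCLOCK_GUEST_STOPPED))
		return false;

	pvti->flags &= (uint8_t)~PVCLOCK_GUEST_STOPPED;
	touch_watchdogs_model();
	return true;
}
```

Note that in this model the guest learns only *that* it was stopped, not for
how long, and nothing here moves the guest clock.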

I think that the guest uses it only when kvmclock is used, but
we can think about extending this to make the guest use it
even when the bare TSC is used, and also implement whatever logic is
needed to jump the guest clock forward when this bit is set.

What do you think?

Best regards,
Maxim Levitsky

>
> Paolo
>