Re: [PATCH 1/5] x86/kvm: On KVM re-enable (e.g. after suspend), update clocks
From: Radim Krcmar
Date: Thu Mar 17 2016 - 15:58:30 EST
2016-03-17 11:22-0700, Andy Lutomirski:
> On Mar 17, 2016 8:10 AM, "Radim Krcmar" <rkrcmar@xxxxxxxxxx> wrote:
>> 2016-03-16 16:07-0700, Andy Lutomirski:
>>> On Wed, Mar 16, 2016 at 3:59 PM, Radim Krcmar <rkrcmar@xxxxxxxxxx> wrote:
>>>> 2016-03-16 15:15-0700, Andy Lutomirski:
>>>>> FWIW, if you ever intend to support ART ("always running timer")
>>>>> passthrough, this is going to be a giant clusterfsck. Good luck. I
>>>>> haven't gotten a straight answer as to what hardware actually supports
>>>>> that thing, so even testing isn't so easy.
>>>>
>>>> Hm, AR TSC would be best handled by doing nothing ... dropping the
>>>> faking logic just became tempting.
>>
>> ART is different from what I initially thought, it's the underlying
>> mechanism for invariant TSC and nothing more ... we already forbid
>> migrations when the guest knows about invariant TSC, so we could do the
>> same and let ART be virtualized. (Suspend has to be forbidden too.)
>
> It's more than that -- it's a TSC-like clock that can be read by PCIe devices.
So ART is for time synchronization within the machine. Makes sense now.
>>> As it stands, ART is screwed if you adjust the VMCS's tsc offset. But
>>
>> Luckily, assigning real hardware can prevent migration or suspend, so we
>> won't need to adjust the offset during runtime. TSC is a generally
>> unmigratable device that just happens to live on the CPU.
>>
>> (It would have been better to hide TSC capability from the guest and only
>> use rdtsc for kvmclock if the guest wanted fancy features.)
>>
>
> I think that, if KVM passes through an ART-supporting NIC, it might be
> rather messy to try to avoid passing through TSC as well.
I agree. Migrating a guest with an ART-supporting NIC is going to be hard
or impossible, so there is no big drawback in exposing TSC.
If KVM adds the host TSC_ADJUST and the VMCS TSC offset to the guest
TSC_ADJUST, then an ART-supporting NIC should produce timestamps that are
compatible with the VCPUs.
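A rough sketch of the arithmetic, not KVM code. It assumes the usual
CPUID.15H relation TSC = ART * num / den + TSC_ADJUST and no TSC scaling
in the guest. All function names are made up for illustration, and the
question of whether the guest's own TSC_ADJUST writes are already folded
into the VMCS offset is ignored (the guest term is taken as a plain input).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert an ART value to a TSC value using the
 * CPUID.15H ratio (num/den) plus a TSC_ADJUST-style offset.  The
 * multiplication is split so the remainder term cannot overflow. */
static uint64_t art_to_tsc(uint64_t art, uint32_t num, uint32_t den,
                           uint64_t adjust)
{
    assert(den != 0);
    uint64_t quot = art / den;
    uint64_t rem = art % den;

    return quot * num + (rem * num) / den + adjust;
}

/* What the guest observes on rdtsc, assuming only the VMCS TSC offset
 * is applied (unsigned wraparound handles negative offsets). */
static uint64_t guest_rdtsc(uint64_t host_tsc, uint64_t vmcs_tsc_offset)
{
    return host_tsc + vmcs_tsc_offset;
}

/* The TSC_ADJUST value the guest would need to see so that its own
 * ART -> TSC conversion matches what its VCPUs read. */
static uint64_t guest_visible_tsc_adjust(uint64_t guest_adjust,
                                         uint64_t host_adjust,
                                         uint64_t vmcs_tsc_offset)
{
    return guest_adjust + host_adjust + vmcs_tsc_offset;
}
```

With guest_adjust = 0, art_to_tsc() fed with guest_visible_tsc_adjust()
gives the same value as guest_rdtsc() applied to the host's conversion of
the same ART reading, which is the property the NIC timestamps would rely on.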
> But maybe a
> pvclock-like structure could expose the ART-kvmclock offset and scale.
I think that getting ART from kvmclock would turn out to be horrible.