Re: [patch 3/3] PTP: add kvm PTP driver

From: Radim Krcmar
Date: Tue Jan 17 2017 - 10:38:45 EST


2017-01-17 09:30-0200, Marcelo Tosatti:
> On Tue, Jan 17, 2017 at 09:03:27AM +0100, Miroslav Lichvar wrote:
>> On Mon, Jan 16, 2017 at 06:01:14PM -0200, Marcelo Tosatti wrote:
>> > On Mon, Jan 16, 2017 at 05:47:15PM -0200, Marcelo Tosatti wrote:
>> > > On Mon, Jan 16, 2017 at 05:36:55PM -0200, Marcelo Tosatti wrote:
>> > > > Sorry, unless i am misunderstanding how this works, it'll get the guest clock
>> > > > 2us behind, which is something not wanted.
>> > > >
>> > > > Miroslav, if ->gettime64 returns the host realtime at 2us in the past,
>> > > > this means Chrony will sync the guest clock to
>> > > >
>> > > > host realtime - 2us
>> > > >
>> > > > Is that correct?
>>
>> Probably. It depends on the error of both host and guest timestamps.
>> If the error is the same on both sides, it will cancel out. An
>> occasional spike in the delay shouldn't be a problem as the reading
>> will be filtered out, but for best accuracy it's necessary that the
>> host's timestamp is taken in the middle between the guest's
>> timestamps.
>
> The problem is that spikes can be far from occasional: it depends on activity of
> the host CPU and interrupts. Whose delay can be "intermittent": as long
> as interrupts are being sent to the host CPU, for example, the delay
> will be high (which can last minutes).
>
> The TSC reading in the guest KVM PTP driver corrects for that delay.
>
>> Users of the PTP_SYS_OFFSET ioctl assume that (ts[0]+ts[2])/2
>> corresponds to ts[1], (ts[2]+ts[4])/2 corresponds to ts[3], and so on.
>>
>>                   ts[1]     ts[3]
>> Host time  ---------+---------+........
>>                     |         |
>>                     |         |
>> Guest time ----+---------+---------+......
>>              ts[0]     ts[2]     ts[4]

KVM PTP delay moves host ts[i] to be close to guest ts[i+1] and makes
the offset very consistent, so the graph would look like:

                      ts[1]     ts[3]
Host time  -------------+---------+........
                        |         |
                        |         |
Guest time ----+---------+---------+......
             ts[0]     ts[2]     ts[4]

which doesn't sound good if users assume that the host reading is in the
middle -- the guest time would be ahead of the host time.
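
To make the bias concrete, here is a rough sketch of the midpoint
arithmetic a PTP_SYS_OFFSET user is assumed to do. The function name and
the sample numbers are made up for illustration; this is not taken from
any real client, and it assumes both clocks are actually in sync.

```python
def sys_offset_estimates(ts):
    """ts = [guest0, host0, guest1, host1, ..., guestN] in ns.

    Returns one (offset, delay) pair per host reading, where offset is
    the host reading minus the midpoint of the surrounding guest
    readings, i.e. the (ts[0]+ts[2])/2 assumption quoted above.
    """
    out = []
    for i in range(1, len(ts) - 1, 2):
        guest_mid = (ts[i - 1] + ts[i + 1]) / 2
        delay = ts[i + 1] - ts[i - 1]
        out.append((ts[i] - guest_mid, delay))
    return out


# Hypothetical readings: clocks in sync, guest reads 2000 ns apart.
centered = [0, 1000, 2000, 3000, 4000]  # host reading in the middle
late = [0, 1900, 2000, 3900, 4000]      # host reading close to guest ts[i+1]

print(sys_offset_estimates(centered))   # no apparent offset
print(sys_offset_estimates(late))       # host looks 900 ns ahead of guest
```

With the second set of readings the user sees a constant positive
offset and would step the guest forward by it, which is the "guest ahead
of host" case.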

I'm wondering why the PTP precision is around 10ns when the hypercall
takes around 2-3k cycles. Have you measured the guest<->host offset by
getting the output of the hypercall, i.e.
{host_sec @ tsc, host_nsec @ tsc, tsc}
and comparing it with guest time computed from the same tsc, i.e.
{guest_sec @ tsc, guest_nsec @ tsc}
?
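
Roughly what I mean, as a sketch only. It assumes the guest clock can
be modelled as a linear function of the TSC from some base point;
guest_ns_at(), base_tsc, base_ns and tsc_hz are hypothetical names for
that model, not the kernel's interfaces, and the numbers are invented.

```python
NSEC_PER_SEC = 1_000_000_000


def guest_ns_at(tsc, base_tsc, base_ns, tsc_hz):
    """Guest time in ns at a given TSC value (assumed linear model)."""
    return base_ns + (tsc - base_tsc) * NSEC_PER_SEC // tsc_hz


def host_minus_guest_ns(host_sec, host_nsec, tsc, base_tsc, base_ns, tsc_hz):
    """Offset between host and guest time evaluated at the same TSC."""
    host_ns = host_sec * NSEC_PER_SEC + host_nsec
    return host_ns - guest_ns_at(tsc, base_tsc, base_ns, tsc_hz)


# Invented sample: 2.5 GHz TSC, guest clock based at 5 s.
offset = host_minus_guest_ns(
    host_sec=5, host_nsec=1_000_150,   # {host_sec @ tsc, host_nsec @ tsc}
    tsc=1_000_000 + 2_500_000,         # tsc returned with the host time
    base_tsc=1_000_000, base_ns=5 * NSEC_PER_SEC, tsc_hz=2_500_000_000,
)
print(offset)
```

Since both times refer to the same tsc, the hypercall latency should
drop out of the result.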

Thanks.