Re: [PATCH v2 01/14] KVM: x86: change PIT discard tick policy

From: Radim Krčmář
Date: Fri Feb 26 2016 - 08:44:56 EST


2016-02-25 20:11+0100, Paolo Bonzini:
>> 2016-02-25 14:38+0100, Paolo Bonzini:
>>> On 19/02/2016 15:44, Radim Krčmář wrote:
>>> So we can change QEMU's kvm-i8254 to accept "slew" and warn if
>>> "delay" is given.
>>
>> QEMU 4e4fa398db69 ("qdev: Introduce lost tick policy property") defines:
>>
>> delay - replay all lost ticks in a row once the guest accepts them
>> again
>> slew - lost ticks are gradually replayed at a higher frequency than
>> the original tick
>>
>> "delay" is exactly how kvm-i8254 behaves (in its "reinject" mode), so I
>> think we shouldn't change it.
>
> Ooh, I missed this commit message indeed. Then libvirt delay != QEMU
> delay, isn't it?

Exactly, it's sad. libvirt delay -> QEMU discard.

(Lost) tick policy relations look like this:

  libvirt |  QEMU
  catchup -> delay
  delay   -> discard
  discard -> n/a
  catchup -> slew
  delay   -> merge (?)
  merge   -> n/a (?)

Delay, discard, and merge are too ambiguous to make sense.
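
To make the two QEMU definitions quoted above concrete, here is a minimal
sketch of the delivery times they imply. It is not taken from QEMU or KVM
code: the 20-unit period, the blocked window (0, 42), the doubled replay
rate for "slew", and all function names are assumptions chosen to match
the numbers used later in this thread.

```python
PERIOD = 20  # assumed tick period (ticks due at 0, 20, 40, ...)


def due_ticks(count, period=PERIOD):
    """Times at which the timer wants to raise a tick."""
    return [i * period for i in range(count)]


def qemu_delay(due, blocked_from, blocked_until):
    """QEMU "delay": replay all lost ticks in a row once the guest accepts them."""
    return [blocked_until if blocked_from < t < blocked_until else t for t in due]


def qemu_slew(due, blocked_from, blocked_until, fast_period=PERIOD // 2):
    """QEMU "slew": replay lost ticks at a higher frequency than the original tick.

    fast_period is a hypothetical catch-up interval, not a QEMU parameter.
    """
    delivered = []
    for t in due:
        when = blocked_until if blocked_from < t < blocked_until else t
        # While a backlog exists, ticks are spaced fast_period apart.
        if delivered and when < delivered[-1] + fast_period:
            when = delivered[-1] + fast_period
        delivered.append(when)
    return delivered


ticks = due_ticks(6)
print(qemu_delay(ticks, 0, 42))  # [0, 42, 42, 60, 80, 100]
print(qemu_slew(ticks, 0, 42))   # [0, 42, 52, 62, 80, 100]
```

In both cases every tick is eventually delivered, which is why both fall
under libvirt's "catchup" in the table above.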

>> [...]
>> and this is incompatible with libvirt's definition of "discard"
>>
>> The guest time may be delayed, unless the OS has explicit handling of
>> lost ticks.
>>
>> "may" doesn't fit. You can only say
>> - the guest time is delayed.
>>
>> which is best described by "delay".
>
> I think we can safely ignore the "may be" -- you cannot say for sure
> that the guest time "will" be delayed since you could always have a very
> enlightened guest.
> ... but then, by removing the handwavy "may be" would you say that
> libvirt delay and libvirt discard are the same?

I would, which is why the "may be" is significant -- the timer has to
provide tools to the guest, even if the guest ignores them.

Do you agree that the following rephrasing is identical to libvirt discard?

  Lost ticks will delay the guest time, unless the guest OS has explicit
  handling of lost ticks.

> Would 0, 42, 62, 82 be
> a valid implementation of the libvirt "delay" policy?

Distinguishing it from saner variants (0, 60, 80 or 0, 20/42, 60, 80)
would be nice, because shifting the phase is not "continuing at normal
rate" for me, but I'd frown and accept it ...
The important part is that guest time never recovers from the lost tick.
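
A small sketch of what "never recovers" means for a guest that keeps time
purely by counting ticks. The 20-unit period, the helper name guest_lag,
and the sample delivery sequences are assumptions for illustration only.

```python
PERIOD = 20  # assumed tick period


def guest_lag(delivery_times, now, period=PERIOD):
    """How far a tick-counting guest is behind real time at `now`."""
    due = now // period + 1  # ticks due at 0, period, ..., up to `now`
    seen = sum(1 for t in delivery_times if t <= now)
    return (due - seen) * period


# Ticks at 20 and 40 could not be delivered on time; delivery resumes at 42.
one_lost = [0, 42] + list(range(60, 201, PERIOD))       # 0, 20/42, 60, 80, ...
two_lost = [0] + list(range(60, 201, PERIOD))           # 0, 60, 80, ...
caught_up = [0, 42, 42] + list(range(60, 201, PERIOD))  # both replayed

for name, seq in [("one lost", one_lost), ("two lost", two_lost),
                  ("caught up", caught_up)]:
    print(name, guest_lag(seq, 80), guest_lag(seq, 200))
# one lost 20 20
# two lost 40 40
# caught up 0 0
```

The lag is the same at t=80 and at t=200: without a replay, or explicit
lost-tick handling in the guest, nothing ever shrinks it.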

>>> Therefore, it _also_ happens that thanks to IRR and NMI latching you can
>>> implement "merge" without having that kind of relationship between the
>>> timer device and the interrupt controller.
>>
>> I disagree. IRR can catch at most one interrupt, so it is insufficient
>> to implement libvirt's merge. (libvirt's merge also has the conditional
>> "The guest time may be delayed".)
>
> Hmm... is your point that the i8254 _alone_ is implementing discard, and
> the tick delivery time is _actually_ 0, 20, 60, 80 (and the t=20 tick is
> delivered late but not lost due to the i8259 buffer)? And hence the
> QEMU device model should see it as discard. I can definitely agree with
> that.

Yes. Only the tick at 40 is lost.
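
The scenario can be sketched with a one-deep pending latch standing in for
the interrupt controller's IRR bit. This is a toy model under stated
assumptions (period 20, a single window (0, 42) during which the guest
cannot accept the interrupt, hypothetical function name), not the i8254 or
i8259 emulation code.

```python
def deliver_with_latch(due, blocked_from, blocked_until):
    """Model one blocked window with a single pending bit.

    Returns (delivered, lost): delivered is a list of
    (due_time, delivery_time) pairs, lost is a list of due times.
    """
    delivered, lost = [], []
    latched = None  # at most one tick can be pending
    for t in due:
        if blocked_from < t < blocked_until:
            if latched is None:
                latched = t     # first tick in the window is latched
            else:
                lost.append(t)  # the pending bit is already set
        else:
            if latched is not None:
                # The latched tick fires as soon as the guest accepts it.
                delivered.append((latched, blocked_until))
                latched = None
            delivered.append((t, t))
    if latched is not None:
        delivered.append((latched, blocked_until))
    return delivered, lost


delivered, lost = deliver_with_latch([0, 20, 40, 60, 80], 0, 42)
print(delivered)  # [(0, 0), (20, 42), (60, 60), (80, 80)]
print(lost)       # [40]
```

Seen from the timer device alone, nothing is replayed; the late t=20 tick
comes only from the latch, and any further tick in the window is dropped.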

> There is still the matter of:
>
> - improving the documentation

Absolutely, this email thread shouldn't have existed.

QEMU merge policy isn't defined de facto ... do we say that it falls
into libvirt delay? (Or just remove it?)

merge - if multiple ticks are lost, all of them are merged into one
which is replayed once the guest accepts it again

> - clarifying the meaning of libvirt delay

Yes, I wouldn't mind excluding 0, 42, 62, 82. :)

> - deciding whether it's worth changing the meaning of QEMU delay to
> match libvirt's (and the default kvm-pit policy from delay to slew)

Changing QEMU's lost_tick_policy seems like a recipe for confusion.
It'd be nice to unify QEMU and libvirt terms, but QEMU does care about
backward compatibility and I think it is not wise to do this change
without a new interface. I'd rather just document and forget. :)

> But if we can agree on this, I can apply patch 1 as is, even for 4.5.

I think we agree on all parts that affect this series.
I'll start preparing v3. (Likely posting on Tuesday/Wednesday.)

Thank you.