Re: [PATCH v6 3/3] KVM: LAPIC: Apply change to TDCR right away to the timer

From: Radim Krčmář
Date: Fri Oct 06 2017 - 09:15:03 EST


2017-10-05 18:54-0700, Wanpeng Li:
> From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>
> The description in the Intel SDM of how the divide configuration
> register is used: "The APIC timer frequency will be the processor's bus
> clock or core crystal clock frequency divided by the value specified in
> the divide configuration register."
>
> Observation of bare metal showed that when the TDCR is changed, the TMCCT
> does not change or make a big jump in value, but the rate at which it
> counts down changes.
>
> The patch updates the emulation of the APIC timer so that a change to the
> divide configuration is reflected in the value of the counter and in
> when the next interrupt is triggered.
>
> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
> Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
> ---
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> @@ -1458,6 +1458,36 @@ static void start_sw_period(struct kvm_lapic *apic)
> HRTIMER_MODE_ABS_PINNED);
> }
>
> +static bool update_target_expiration(struct kvm_lapic *apic, uint32_t old_divisor)
> +{
> + ktime_t now, remaining;
> + u64 tscl = rdtsc(), delta;
> +
> + now = ktime_get();
> + remaining = ktime_sub(apic->lapic_timer.target_expiration, now);
> + if (ktime_to_ns(remaining) < 0)
> + remaining = 0;
> + delta = mod_64(ktime_to_ns(remaining), apic->lapic_timer.period);

Hm, can this happen?

> + if (!delta)
> + return false;
> +
> + apic->lapic_timer.period = (u64)kvm_lapic_get_reg(apic, APIC_TMICT)
> + * APIC_BUS_CYCLE_NS * apic->divide_count;

I think that it would be safer to always modify the period.
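
To illustrate the ordering I have in mind, here is a minimal standalone sketch, not kernel code. The struct, the function names, and SKETCH_BUS_CYCLE_NS (standing in for APIC_BUS_CYCLE_NS, with an assumed value of 1) are all hypothetical. The period is refreshed before any early return, so it cannot be left stale:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value; stands in for APIC_BUS_CYCLE_NS. */
#define SKETCH_BUS_CYCLE_NS 1ULL

/* Hypothetical reduced timer state, for illustration only. */
struct timer_sketch {
	uint32_t tmict;        /* initial count register */
	uint32_t divide_count; /* current divisor */
	uint64_t period_ns;    /* period derived from the two fields above */
};

/* Recompute the period from the current register state. */
static void refresh_period(struct timer_sketch *t)
{
	assert(t != NULL);
	t->period_ns = (uint64_t)t->tmict * SKETCH_BUS_CYCLE_NS *
		       t->divide_count;
}

/*
 * Returns true when there is a pending expiration left to rescale.
 * The period is refreshed first, so returning early never skips it.
 */
static bool on_divisor_change(struct timer_sketch *t, uint32_t new_divisor,
			      uint64_t remaining_ns)
{
	t->divide_count = new_divisor;
	refresh_period(t);

	if (t->period_ns == 0 || remaining_ns == 0)
		return false;

	return true;
}
```

With tmict = 1000 and a divisor change from 2 to 8, the period becomes 8000 ns even when nothing remains of the current expiration.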

> + delta = delta * apic->divide_count / old_divisor;
> +
> + if (!apic->lapic_timer.period)
> + return false;
> +
> + limit_periodic_timer_frequency(apic);
> +
> + apic->lapic_timer.tscdeadline = kvm_read_l1_tsc(apic->vcpu, tscl) +
> + nsec_to_cycles(apic->vcpu, delta);

We could do that without rdtsc() for added precision and maybe
performance:

apic->lapic_timer.tscdeadline += nsec_to_cycles(apic->vcpu, delta) -
nsec_to_cycles(apic->vcpu, remaining);

// not sure how a negative operand would behave:
// nsec_to_cycles(apic->vcpu, delta - remaining)
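
A rough standalone sketch of the arithmetic, under stated assumptions: ns_to_cycles_sketch() is a hypothetical stand-in for nsec_to_cycles() with a fixed rate of 2 cycles per ns, and the other helper names are made up. Converting delta and remaining separately means the converter only ever sees non-negative values, which sidesteps the question above if the real helper takes an unsigned argument:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for nsec_to_cycles(): fixed 2 cycles per ns. */
static uint64_t ns_to_cycles_sketch(uint64_t nsec)
{
	return nsec * 2;
}

/* Rescale the time left until expiry when the divisor changes. */
static uint64_t rescale_remaining_ns(uint64_t remaining_ns,
				     uint32_t old_divisor,
				     uint32_t new_divisor)
{
	assert(old_divisor != 0);
	return remaining_ns * new_divisor / old_divisor;
}

/*
 * Move the deadline by the difference of two conversions. Each
 * conversion gets a non-negative input; the unsigned add and
 * subtract give the right result whether the deadline moves
 * forward or backward.
 */
static uint64_t adjust_deadline(uint64_t deadline_cycles,
				uint64_t remaining_ns, uint64_t delta_ns)
{
	return deadline_cycles + ns_to_cycles_sketch(delta_ns) -
	       ns_to_cycles_sketch(remaining_ns);
}
```

For example, with 1000 ns remaining and the divisor going from 4 to 2, delta is 500 ns and the deadline moves 1000 cycles earlier at the assumed rate.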

> + apic->lapic_timer.target_expiration = ktime_add_ns(now, delta);
> +
> + return true;
> +}
> +
> static bool set_target_expiration(struct kvm_lapic *apic)
> {
> ktime_t now;
> @@ -1750,13 +1780,20 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
> start_apic_timer(apic);
> break;
>
> - case APIC_TDCR:
> + case APIC_TDCR: {
> + uint32_t old_divisor = apic->divide_count;
> +
> if (val & 4)
> apic_debug("KVM_WRITE:TDCR %x\n", val);
> kvm_lapic_set_reg(apic, APIC_TDCR, val);
> update_divide_count(apic);
> + if (apic->divide_count != old_divisor) {
> + hrtimer_cancel(&apic->lapic_timer.timer);
> + if (update_target_expiration(apic, old_divisor))
> + restart_apic_timer(apic);

I think we can lose a timer here when we cancel a hrtimer whose
expiration time passes before update_target_expiration(), so it never
gets restarted.

Doing restart_apic_timer() unconditionally seems better. It behaves
well if we try to restart a timer that has already fired.
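
The race can be modelled with a small standalone sketch. Everything here is hypothetical and greatly simplified: the sketch_* names are made up, and the model assumes that restarting a timer whose expiration has already passed delivers it immediately, which is how I read restart_apic_timer() but is not verified here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced timer model, for illustration only. */
struct sketch_timer {
	bool armed;             /* a timer is queued */
	uint64_t expires_ns;    /* target expiration */
	unsigned int delivered; /* interrupts delivered so far */
};

static void sketch_cancel(struct sketch_timer *t)
{
	assert(t != NULL);
	t->armed = false;
}

/* Models the "return false when nothing remains" path. */
static bool sketch_update(const struct sketch_timer *t, uint64_t now_ns)
{
	return now_ns < t->expires_ns;
}

/* Assumption: an already expired timer is delivered at once. */
static void sketch_restart(struct sketch_timer *t, uint64_t now_ns)
{
	if (now_ns >= t->expires_ns)
		t->delivered++;
	else
		t->armed = true;
}

/* Restart only when the update succeeded: an expired timer is lost. */
static void divisor_write_conditional(struct sketch_timer *t, uint64_t now_ns)
{
	sketch_cancel(t);
	if (sketch_update(t, now_ns))
		sketch_restart(t, now_ns);
}

/* Restart unconditionally: an expired timer is still delivered. */
static void divisor_write_unconditional(struct sketch_timer *t, uint64_t now_ns)
{
	sketch_cancel(t);
	(void)sketch_update(t, now_ns);
	sketch_restart(t, now_ns);
}
```

If the expiration (100 ns) passes before the divisor write is handled (150 ns), the conditional variant leaves the timer neither armed nor delivered, while the unconditional one delivers it.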

Thanks.

> + }
> break;
> -
> + }
> case APIC_ESR:
> if (apic_x2apic_mode(apic) && val != 0) {
> apic_debug("KVM_WRITE:ESR not zero %x\n", val);
> --
> 2.7.4
>