Re: [PATCH v2] KVM: LAPIC: Fix cancel preemption timer repeatedly due to preemption

From: Paolo Bonzini
Date: Mon Jul 24 2017 - 11:38:41 EST


On 24/07/2017 17:08, Wanpeng Li wrote:
> 2017-07-24 22:45 GMT+08:00 Paolo Bonzini <pbonzini@xxxxxxxxxx>:
>> On 24/07/2017 10:57, Wanpeng Li wrote:
>>> From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>>>
>>> Preemption can occur in the preemption timer expiration handler:
>>>
>>> CPU0 CPU1
>>>
>>> preemption timer vmexit
>>> handle_preemption_timer(vCPU0)
>>> kvm_lapic_expired_hv_timer
>>> hv_timer_is_use == true
>>> sched_out
>>> sched_in
>>> kvm_arch_vcpu_load
>>> kvm_lapic_restart_hv_timer
>>> restart_apic_timer
>>> start_hv_timer
>>> already-expired timer or sw timer triggerd in the window
>>> start_sw_timer
>>> cancel_hv_timer
>>
>> At this point, the timer interrupt is injected, right?
>
> Do you mean the new one on CPU1? I think we just set the pending
> timer, we return back to kvm_lapic_expired_hv_timer() after preempt
> notifier sched_in.

start_sw_timer calls apic_timer_expired after cancel_hv_timer.


>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>>> index 2819d4c..8341b40 100644
>>> --- a/arch/x86/kvm/lapic.c
>>> +++ b/arch/x86/kvm/lapic.c
>>> @@ -1560,9 +1560,13 @@ void kvm_lapic_expired_hv_timer(struct kvm_vcpu *vcpu)
>>> {
>>> struct kvm_lapic *apic = vcpu->arch.apic;
>>>
>>> - WARN_ON(!apic->lapic_timer.hv_timer_in_use);
>>> - WARN_ON(swait_active(&vcpu->wq));
>>> - cancel_hv_timer(apic);
>>> + preempt_disable();
>>> + if (!(!apic_lvtt_period(apic) && atomic_read(&apic->lapic_timer.pending))) {
>>
>> Why is the "if" necessary?
>>
>> Maybe all of kvm_lapic_expired_hv_timer and start_sw_timer should be in
>> preemption-disabled regions, which trivially avoids any reentrancy issue
>> with the preempt notifier. Then, cancel_hv_timer can assert that it's
>> called with preemption disabled.
>
> For example:
>
> static int handle_preemption_timer(struct kvm_vcpu *vcpu)
> {
> --------------------------------------------------> We still can
> be preempted here, and do one cancel_hv_timer()

Yes, so that just means you can do


preempt_disable();
/* The preempt notifier has called apic_timer_expired already. */
if (!apic->lapic_timer.hv_timer_in_use)
goto out;

Thinking more about it, even _the caller_ of start_hv_timer and
start_sw_timer should be in a preemption-disabled region for simplicity.
That means that ultimately restart_apic_timer, kvm_lapic_expired_hv_timer
and kvm_lapic_switch_to_sw_timer should call preempt_disable/preempt_enable.

Paolo