Re: [PATCH] KVM: arm/arm64: don't set vtimer->cnt_ctl in kvm_arch_timer_handler

From: Christoffer Dall
Date: Wed Dec 13 2017 - 04:34:37 EST


On Wed, Dec 13, 2017 at 10:27 AM, Marc Zyngier <marc.zyngier@xxxxxxx> wrote:
> On 13/12/17 09:08, Auger Eric wrote:
>> Marc,
>> On 13/12/17 09:56, Marc Zyngier wrote:
>>> Hi Jia,
>>>
>>> On 13/12/17 07:00, Jia He wrote:
>>>> On our Armv8-A server (Qualcomm Amberwing, non-VHE), after applying
>>>> Christoffer's timer optimization patchset ("Optimize arch timer register
>>>> handling"), the guest hangs during kernel boot.
>>>>
>>>> The root cause might be as follows:
>>>> 1. kvm_arch_timer_handler resets vtimer->cnt_ctl to the current
>>>> cntv_ctl register value. It then fails to update the timer's irq
>>>> (irq.level) in some cases where kvm_timer_irq_can_fire() is false.
>>>> 2. This causes kvm_vcpu_check_block to return 0 instead of -EINTR:
>>>>   kvm_vcpu_check_block
>>>>     kvm_cpu_has_pending_timer
>>>>       kvm_timer_is_pending
>>>>         kvm_timer_should_fire
>>>> 3. Thus, the KVM hyp code cannot break out of the loop in kvm_vcpu_block
>>>> (halt polling) and the guest hangs forever.
>>>>
>>>> Fixes: b103cc3f10c0 ("KVM: arm/arm64: Avoid timer save/restore in vcpu entry/exit")
>>>> Signed-off-by: Jia He <jia.he@xxxxxxxxxxxxxxxx>
>>>> ---
>>>> virt/kvm/arm/arch_timer.c | 1 -
>>>> 1 file changed, 1 deletion(-)
>>>>
>>>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
>>>> index f9555b1..bb86433 100644
>>>> --- a/virt/kvm/arm/arch_timer.c
>>>> +++ b/virt/kvm/arm/arch_timer.c
>>>> @@ -100,7 +100,6 @@ static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
>>>>      vtimer = vcpu_vtimer(vcpu);
>>>>
>>>>      if (!vtimer->irq.level) {
>>>> -            vtimer->cnt_ctl = read_sysreg_el0(cntv_ctl);
>>>>              if (kvm_timer_irq_can_fire(vtimer))
>>>>                      kvm_timer_update_irq(vcpu, true, vtimer);
>>>>      }
>>>>
>>>
>>> Which patches are you looking at? The current code in mainline looks
>>> like this:
>>>
>>>     vtimer = vcpu_vtimer(vcpu);
>>>
>>>     vtimer->cnt_ctl = read_sysreg_el0(cntv_ctl);
>>>     if (kvm_timer_irq_can_fire(vtimer))
>>>             kvm_timer_update_irq(vcpu, true, vtimer);
>>>
>>> I'd suggest you use mainline and report back if this doesn't work.
>> The removal of the if (!vtimer->irq.level) test happened in:
>> [PATCH v7 3/8] KVM: arm/arm64: Don't cache the timer IRQ level
>>
>> which is not upstream.
> Ah, my bad (I have that series in my working tree already...).
>
> I still think Jia's approach to this is not quite right. If you don't
> update the status of the timer by reading the HW value, how can you
> decide whether the timer can fire or not?
>

Exactly. We need to know the exact kernel source, the symptoms, and how
to reproduce the problem, and then trace what's going on. We may need to
tweak kvm_timer_is_pending(), but I don't yet see a case where it breaks.
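
To illustrate Marc's point about why the handler has to read the HW value
first, here is a minimal userspace sketch. It is not kernel code. The
struct and the model_* names are hypothetical. The bit positions follow
the architectural CNTV_CTL layout (ENABLE bit 0, IMASK bit 1, ISTATUS
bit 2), and model_can_fire() assumes the same "enabled and not masked"
check that kvm_timer_irq_can_fire() makes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural CNTV_CTL bits: ENABLE, IMASK, ISTATUS. */
#define CTL_ENABLE   (1u << 0)
#define CTL_IT_MASK  (1u << 1)
#define CTL_IT_STAT  (1u << 2)

/* Hypothetical stand-in for the vtimer context: only the cached control value. */
struct timer_model {
	uint32_t cnt_ctl;
};

/* Assumed to mirror kvm_timer_irq_can_fire(): timer enabled and not masked. */
static bool model_can_fire(const struct timer_model *t)
{
	return !(t->cnt_ctl & CTL_IT_MASK) && (t->cnt_ctl & CTL_ENABLE);
}

/*
 * Model of the handler's decision. hw_ctl is what reading CNTV_CTL would
 * return; refresh selects whether the cached copy is updated before the
 * check (true) or left stale, as the proposed patch would do (false).
 */
static bool model_handler_raises_irq(struct timer_model *t, uint32_t hw_ctl,
				     bool refresh)
{
	if (refresh)
		t->cnt_ctl = hw_ctl;
	return model_can_fire(t);
}
```

With a cached cnt_ctl of 0 and a HW value of CTL_ENABLE | CTL_IT_STAT, the
model raises the IRQ only when refresh is true. With a stale cached value
the decision reflects whatever the guest programmed at the last save, not
what the hardware says now.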

Thanks,
-Christoffer