Re: [RFT PATCH v5 3/3] KVM: nVMX: keep preemption timer enabled during L2 execution

From: yunhong jiang
Date: Fri Jul 08 2016 - 17:34:49 EST


On Fri, 8 Jul 2016 19:41:48 +0200
Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:

>
>
> On 08/07/2016 19:29, yunhong jiang wrote:
> > >
> > > exec_control = vmcs12->pin_based_vm_exec_control;
> > > - exec_control |= vmcs_config.pin_based_exec_ctrl;
> > > +
> > > + /* Preemption timer setting is only taken from vmcs01.
> > > */ exec_control &= ~PIN_BASED_VMX_PREEMPTION_TIMER;
> >
> > Do we still keep this clear here with followed changes?
>
> Yes. If L1 wants to use the preemption timer the bit will be set in
> vmcs12->pin_based_vm_exec_control In this case, however, KVM uses an
> hrtimer to emulate L1's preemption timer, so we must not copy the bit
> into the vmcs02 (i.e. the VMCS that L0 uses to run L2). Thus the
> preemption timer control of the vmcs02 must come exclusively from
> vmx->hv_deadline_tsc.

Thanks for the clarification.

>
> > > + exec_control |= vmcs_config.pin_based_exec_ctrl;
> > > + if (vmx->hv_deadline_tsc == -1)
> > > + exec_control &= ~PIN_BASED_VMX_PREEMPTION_TIMER;
> > >
> > > + /* Posted interrupts setting is only taken from vmcs12.
> > > */ if (nested_cpu_has_posted_intr(vmcs12)) {
> > > /*
> > > * Note that we use L0's vector here and in
> > > @@ -10727,8 +10732,14 @@ static void nested_vmx_vmexit(struct
> > > kvm_vcpu *vcpu, u32 exit_reason,
> > > load_vmcs12_host_state(vcpu, vmcs12);
> > >
> > > - /* Update TSC_OFFSET if TSC was changed while L2 ran */
> > > + /* Update any VMCS fields that might have changed while
> > > L2 ran */ vmcs_write64(TSC_OFFSET, vmx->nested.vmcs01_tsc_offset);
> > > + if (vmx->hv_deadline_tsc == -1)
> > > + vmcs_clear_bits(PIN_BASED_VM_EXEC_CONTROL,
> > > + PIN_BASED_VMX_PREEMPTION_TIMER);
> > > + else
> > > + vmcs_set_bits(PIN_BASED_VM_EXEC_CONTROL,
> > > + PIN_BASED_VMX_PREEMPTION_TIMER);
> >
> > Why do we need change the vmcs01 here? Per my understanding, the
> > vmcs01 is not changed when the L2 guest is running thus the
> > PIN_BASED_VM_EXEC_CONTROL should not be changed?
>
> This is the point where we are updating the vmcs01 after exiting. If
> vmx->hv_deadline_tsc has changed (for example because of a preemption


Thanks for the explaination. I try to go through the code and still
have one question. I'd describe below and hope get your input.

When the L2 guest running while the VMX Preemption timer triggered, the
vcpu_enter_guest() will trigger vmx_handle_exit(), with the CPU vmcs as
vmcs02. On the vmx_handle_exit(), the nested_vmx_exit_handled() return
false as the 1st patch did, thus the vmcs is not switched. The
kvm_lapic_expired_hv_timer() will be called with vmcs02, instead of
vmcs01. Is it something we wanted? I assume we should use vmcs01 there
since we will clear the preemption timer VMCS bit there.

Thanks
--jyh

> timer vmexit, or because L2 did a HLT and L1 is not intercepting HLT)
> we need to update the preemption timer control to synchronize it with
> vmx->hv_deadline_tsc.
>
> > I'm not familiar with nested VMX, sorry if this is a naive question.
>
> It's not naive, don't worry! :)
>
> Paolo