Re: [RFT PATCH v5 3/3] KVM: nVMX: keep preemption timer enabled during L2 execution

From: yunhong jiang
Date: Fri Jul 08 2016 - 13:33:40 EST


On Fri, 8 Jul 2016 14:02:13 +0200
Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:

> Because the vmcs12 preemption timer is emulated through a separate
> hrtimer, we can keep on using the preemption timer in the vmcs02 to
> emulate L1's TSC deadline timer.
>
> However, the corresponding bit in the pin-based execution control
> field must be kept consistent between vmcs01 and vmcs02. On vmentry
> we copy it into the vmcs02; on vmexit the preemption timer must be
> disabled in the vmcs01 if a preemption timer vmexit happened while in
> guest mode.
>
> The preemption timer value in the vmcs02 is set by vmx_vcpu_run, so it
> need not be considered in prepare_vmcs02.
>
> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> ---
> arch/x86/kvm/vmx.c | 15 +++++++++++++--
> 1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 0048be79c7b9..8cda4449a60e 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -9796,9 +9796,14 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
> vmcs_write64(VMCS_LINK_POINTER, -1ull);
>
> exec_control = vmcs12->pin_based_vm_exec_control;
> - exec_control |= vmcs_config.pin_based_exec_ctrl;
> +
> + /* Preemption timer setting is only taken from vmcs01. */
> exec_control &= ~PIN_BASED_VMX_PREEMPTION_TIMER;

Do we still need to clear the bit here, given the changes that follow?
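
To make the question concrete, here is a minimal standalone sketch of the
computation in this hunk. It is a model, not kernel code: the helper name
vmcs02_pin_controls, the NO_DEADLINE constant and the early_clear switch are
hypothetical, and only the order of the mask operations is taken from the
patch. The bit value 0x40 is the one the kernel headers use for
PIN_BASED_VMX_PREEMPTION_TIMER.

```c
#include <assert.h>
#include <stdint.h>

/* "Activate VMX-preemption timer" bit of the pin-based controls. */
#define PIN_BASED_VMX_PREEMPTION_TIMER 0x00000040u
/* Hypothetical name modelling hv_deadline_tsc == -1 (no deadline armed). */
#define NO_DEADLINE UINT64_MAX

/*
 * Model of the pin-based control computation in prepare_vmcs02.
 * early_clear is a hypothetical switch: 1 keeps the first clear (as in the
 * patch), 0 drops it, so the two variants can be compared.
 */
static uint32_t vmcs02_pin_controls(uint32_t vmcs12_ctl, uint32_t config_ctl,
                                    uint64_t hv_deadline_tsc, int early_clear)
{
    uint32_t exec_control = vmcs12_ctl;

    assert(early_clear == 0 || early_clear == 1);

    /* Drop whatever L1 requested for the preemption timer. */
    if (early_clear)
        exec_control &= ~PIN_BASED_VMX_PREEMPTION_TIMER;

    /* Merge in the controls L0 uses for its own guests. */
    exec_control |= config_ctl;

    /* No deadline armed for L1: the timer must stay off in vmcs02. */
    if (hv_deadline_tsc == NO_DEADLINE)
        exec_control &= ~PIN_BASED_VMX_PREEMPTION_TIMER;

    return exec_control;
}
```

In this model the two variants differ in only one case: vmcs12 has the bit
set, vmcs_config.pin_based_exec_ctrl does not, and a deadline is armed.
Whether that combination can occur in practice depends on how
hv_deadline_tsc is set on hardware without the preemption timer, which this
hunk does not show.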

> + exec_control |= vmcs_config.pin_based_exec_ctrl;
> + if (vmx->hv_deadline_tsc == -1)
> + exec_control &= ~PIN_BASED_VMX_PREEMPTION_TIMER;
>
> + /* Posted interrupts setting is only taken from vmcs12. */
> if (nested_cpu_has_posted_intr(vmcs12)) {
> /*
> * Note that we use L0's vector here and in
> @@ -10727,8 +10732,14 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
> load_vmcs12_host_state(vcpu, vmcs12);
>
> - /* Update TSC_OFFSET if TSC was changed while L2 ran */
> + /* Update any VMCS fields that might have changed while L2 ran */
> vmcs_write64(TSC_OFFSET, vmx->nested.vmcs01_tsc_offset);
> + if (vmx->hv_deadline_tsc == -1)
> + vmcs_clear_bits(PIN_BASED_VM_EXEC_CONTROL,
> + PIN_BASED_VMX_PREEMPTION_TIMER);
> + else
> + vmcs_set_bits(PIN_BASED_VM_EXEC_CONTROL,
> + PIN_BASED_VMX_PREEMPTION_TIMER);

Why do we need to change the vmcs01 here? As I understand it, the vmcs01 is
not changed while the L2 guest is running, so PIN_BASED_VM_EXEC_CONTROL should
not have changed either. I'm not familiar with nested VMX, so sorry if this is
a naive question.

Thanks
--jyh