Re: [PATCH 1/5] KVM: nSVM: Sync next_rip field from vmcb12 to vmcb02

From: Sean Christopherson
Date: Fri Apr 01 2022 - 14:32:58 EST


On Thu, Mar 10, 2022, Maciej S. Szmigiero wrote:
> From: "Maciej S. Szmigiero" <maciej.szmigiero@xxxxxxxxxx>
>
> The next_rip field of a VMCB is *not* an output-only field for a VMRUN.
> This field value (instead of the saved guest RIP) in used by the CPU for
> the return address pushed on stack when injecting a software interrupt or
> INT3 or INTO exception.
>
> Make sure this field gets synced from vmcb12 to vmcb02 when entering L2 or
> loading a nested state.
>
> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@xxxxxxxxxx>
> ---
> arch/x86/kvm/svm/nested.c | 4 ++++
> arch/x86/kvm/svm/svm.h | 1 +
> 2 files changed, 5 insertions(+)
>
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index d736ec6514ca..9656f0d6815c 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -366,6 +366,7 @@ void __nested_copy_vmcb_control_to_cache(struct kvm_vcpu *vcpu,
> to->nested_ctl = from->nested_ctl;
> to->event_inj = from->event_inj;
> to->event_inj_err = from->event_inj_err;
> + to->next_rip = from->next_rip;
> to->nested_cr3 = from->nested_cr3;
> to->virt_ext = from->virt_ext;
> to->pause_filter_count = from->pause_filter_count;
> @@ -638,6 +639,8 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm)
> svm->vmcb->control.int_state = svm->nested.ctl.int_state;
> svm->vmcb->control.event_inj = svm->nested.ctl.event_inj;
> svm->vmcb->control.event_inj_err = svm->nested.ctl.event_inj_err;
> + /* The return address pushed on stack by the CPU for some injected events */
> + svm->vmcb->control.next_rip = svm->nested.ctl.next_rip;

This needs to be gated by nrips being enabled _and_ exposed to L1, i.e.

if (svm->nrips_enabled)
vmcb02->control.next_rip = svm->nested.ctl.next_rip;

though I don't see any reason to add the condition to the copy to/from cache flows.

> if (!nested_vmcb_needs_vls_intercept(svm))
> svm->vmcb->control.virt_ext |= VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK;
> @@ -1348,6 +1351,7 @@ static void nested_copy_vmcb_cache_to_control(struct vmcb_control_area *dst,
> dst->nested_ctl = from->nested_ctl;
> dst->event_inj = from->event_inj;
> dst->event_inj_err = from->event_inj_err;
> + dst->next_rip = from->next_rip;
> dst->nested_cr3 = from->nested_cr3;
> dst->virt_ext = from->virt_ext;
> dst->pause_filter_count = from->pause_filter_count;
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index 93502d2a52ce..f757400fc933 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -138,6 +138,7 @@ struct vmcb_ctrl_area_cached {
> u64 nested_ctl;
> u32 event_inj;
> u32 event_inj_err;
> + u64 next_rip;
> u64 nested_cr3;
> u64 virt_ext;
> u32 clean;

I don't know why this struct has

u8 reserved_sw[32];

but presumably it's for padding, i.e. probably should be reduced to 24 bytes.