Re: [PATCH v2] kvm: nVMX: Set nested_run_pending in vmx_set_nested_state after checks complete

From: Aaron Lewis
Date: Wed May 08 2019 - 17:15:21 EST


From: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
Date: Wed, May 8, 2019 at 11:13 AM
To: Aaron Lewis
Cc: Paolo Bonzini, <linux-kernel@xxxxxxxxxxxxxxx>,
<kvm@xxxxxxxxxxxxxxx>, Peter Shier

> On Wed, May 08, 2019 at 10:53:12AM -0700, Aaron Lewis wrote:
> > nested_run_pending is also checked in
> > nested_vmx_check_vmentry_postreqs
> > (https://elixir.bootlin.com/linux/v5.1/source/arch/x86/kvm/vmx/nested.c#L2709)
> > so I think the setting needs to be moved to just prior to that call
> > with Paolo's rollback along with another for if the prereqs and
> > postreqs fail. I put a patch together below:
>
> Gah, I missed that usage (also, it's now nested_vmx_check_guest_state()).
>
> Side topic, I think the VM_ENTRY_LOAD_BNDCFGS check should be gated by
> nested_run_pending, a la the EFER check.'
>
> > ------------------------------------
> >
> > nested_run_pending=1 implies we have successfully entered guest mode.
> > Move setting from external state in vmx_set_nested_state() until after
> > all other checks are complete.
> >
> > Signed-off-by: Aaron Lewis <aaronlewis@xxxxxxxxxx>
> > Reviewed-by: Peter Shier <pshier@xxxxxxxxxx>
> > ---
> > arch/x86/kvm/vmx/nested.c | 14 +++++++++-----
> > 1 file changed, 9 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > index 6401eb7ef19c..cf1f810223d2 100644
> > --- a/arch/x86/kvm/vmx/nested.c
> > +++ b/arch/x86/kvm/vmx/nested.c
> > @@ -5460,9 +5460,6 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
> > if (!(kvm_state->flags & KVM_STATE_NESTED_GUEST_MODE))
> > return 0;
> >
> > - vmx->nested.nested_run_pending =
> > - !!(kvm_state->flags & KVM_STATE_NESTED_RUN_PENDING);
>
> Alternatively, it might be better to leave nested_run_pending where it
> is and instead add a label to handle clearing the flag on error. IIUC,
> the real issue is that nested_run_pending is left set after a failed
> vmx_set_nested_state(), not that its shouldn't be set in the shadow
> VMCS handling.
>
> Patch attached, though it's completely untested. The KVM selftests are
> broken for me right now, grrr.
>
> > -
> > if (nested_cpu_has_shadow_vmcs(vmcs12) &&
> > vmcs12->vmcs_link_pointer != -1ull) {
> > struct vmcs12 *shadow_vmcs12 = get_shadow_vmcs12(vcpu);
> > @@ -5480,14 +5477,21 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
> > return -EINVAL;
> > }
> >
> > + vmx->nested.nested_run_pending =
> > + !!(kvm_state->flags & KVM_STATE_NESTED_RUN_PENDING);
> > +
> > if (nested_vmx_check_vmentry_prereqs(vcpu, vmcs12) ||
> > - nested_vmx_check_vmentry_postreqs(vcpu, vmcs12, &exit_qual))
> > + nested_vmx_check_vmentry_postreqs(vcpu, vmcs12, &exit_qual)) {
> > + vmx->nested.nested_run_pending = 0;
> > return -EINVAL;
> > + }
> >
> > vmx->nested.dirty_vmcs12 = true;
> > ret = nested_vmx_enter_non_root_mode(vcpu, false);
> > - if (ret)
> > + if (ret) {
> > + vmx->nested.nested_run_pending = 0;
> > return -EINVAL;
> > + }
> >
> > return 0;
> > }

Here is an update based on your patch. I ran these changes against
the test vmx_set_nested_state_test, and it run successfully.

That's correct that we are only concerned about restoring the state of
nested_run_pending, so it's fine to set it where we do as long as we
back the state change out before returning if we get an error.

---------------------------------------------

nested_run_pending=1 implies we have successfully entered guest mode.
Move setting from external state in vmx_set_nested_state() until after
all other checks are complete.

Signed-off-by: Aaron Lewis <aaronlewis@xxxxxxxxxx>
Tested-by: Aaron Lewis <aaronlewis@xxxxxxxxxx>
Reviewed-by: Peter Shier <pshier@xxxxxxxxxx>
---
arch/x86/kvm/vmx/nested.c | 20 ++++++++++++++------
1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 6401eb7ef19c..fe5814df5149 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -5468,28 +5468,36 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
struct vmcs12 *shadow_vmcs12 = get_shadow_vmcs12(vcpu);

if (kvm_state->size < sizeof(kvm_state) + 2 * sizeof(*vmcs12))
- return -EINVAL;
+ goto error_guest_mode_einval;

if (copy_from_user(shadow_vmcs12,
user_kvm_nested_state->data + VMCS12_SIZE,
- sizeof(*vmcs12)))
- return -EFAULT;
+ sizeof(*vmcs12))) {
+ ret = -EFAULT;
+ goto error_guest_mode;
+ }

if (shadow_vmcs12->hdr.revision_id != VMCS12_REVISION ||
!shadow_vmcs12->hdr.shadow_vmcs)
- return -EINVAL;
+ goto error_guest_mode_einval;
}

if (nested_vmx_check_vmentry_prereqs(vcpu, vmcs12) ||
nested_vmx_check_vmentry_postreqs(vcpu, vmcs12, &exit_qual))
- return -EINVAL;
+ goto error_guest_mode_einval;

vmx->nested.dirty_vmcs12 = true;
ret = nested_vmx_enter_non_root_mode(vcpu, false);
if (ret)
- return -EINVAL;
+ goto error_guest_mode_einval;

return 0;
+
+error_guest_mode_einval:
+ ret = -EINVAL;
+error_guest_mode:
+ vmx->nested.nested_run_pending = 0;
+ return ret;
}

void nested_vmx_vcpu_setup(void)