Re: [PATCH v2 10/13] KVM: nSVM: Restrict mapping VMCB12 on nested VMRUN

From: Sean Christopherson
Date: Tue Dec 09 2025 - 13:49:50 EST


On Tue, Dec 09, 2025, Yosry Ahmed wrote:
> On Tue, Dec 09, 2025 at 08:03:15AM -0800, Sean Christopherson wrote:
> > On Mon, Nov 10, 2025, Yosry Ahmed wrote:
> > > +	nested_copy_vmcb_control_to_cache(svm, &vmcb12->control);
> > > +	nested_copy_vmcb_save_to_cache(svm, &vmcb12->save);
> > > +
> > > +	if (!nested_vmcb_check_save(vcpu) ||
> > > +	    !nested_vmcb_check_controls(vcpu)) {
> > > +		vmcb12->control.exit_code = SVM_EXIT_ERR;
> > > +		vmcb12->control.exit_code_hi = 0;
> > > +		vmcb12->control.exit_info_1 = 0;
> > > +		vmcb12->control.exit_info_2 = 0;
> > > +		ret = -1;
> >
> > I don't love shoving the consistency checks in here. I get why you did it, but
> > it's very surprising to see (and/or easy to miss) these consistency checks. The
> > caller also ends up quite wonky:
> >
> > 	if (ret == -EINVAL) {
> > 		kvm_inject_gp(vcpu, 0);
> > 		return 1;
> > 	} else if (ret) {
> > 		return kvm_skip_emulated_instruction(vcpu);
> > 	}
> >
> > 	ret = kvm_skip_emulated_instruction(vcpu);
> >
> > Ha! And it's buggy. __kvm_vcpu_map() can return -EFAULT if creating a host
> > mapping fails. Eww, and blindly using '-1' as the "failed a consistency check"
> > is equally gross, as it relies on kvm_vcpu_map() not returning -EPERM in a very
> > weird way.
>
> I was trying to maintain the pre-existing behavior as much as possible,
> and I think the existing code will handle -EFAULT from kvm_vcpu_map() in
> the same way (skip the instruction and return).
>
> I guess I shouldn't have assumed maintaining the existing behavior is
> the right thing to do.

Maintaining existing behavior is absolutely the right thing to do when moving
code around. It's just that sometimes touching code uncovers pre-existing issues,
as is the case here.
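
To make the two pre-existing issues concrete, here is a minimal standalone
sketch (userspace C, not KVM code). The enum and both helpers are hypothetical
names invented for illustration. The sketch assumes only what the quoted caller
shows: -EINVAL from the map step leads to a #GP, and any other non-zero value
is currently lumped together with a failed consistency check. What KVM should
actually do for a host-side failure such as -EFAULT is deliberately left open.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical outcome type, for illustration only; not part of KVM. */
enum vmrun_outcome {
	VMRUN_OK,		/* mapping worked and all checks passed */
	VMRUN_BAD_GPA,		/* map step returned -EINVAL: caller injects #GP */
	VMRUN_CONSISTENCY,	/* checks failed: caller reports SVM_EXIT_ERR */
	VMRUN_HOST_ERROR,	/* any other map failure, e.g. -EFAULT */
};

/*
 * Hypothetical helper: keep "the map step failed" and "a consistency check
 * failed" as separate inputs so that no errno value has to double as a
 * sentinel.
 */
static enum vmrun_outcome classify_vmrun(int map_ret, bool checks_ok)
{
	if (map_ret == -EINVAL)
		return VMRUN_BAD_GPA;
	if (map_ret)
		return VMRUN_HOST_ERROR;
	return checks_ok ? VMRUN_OK : VMRUN_CONSISTENCY;
}

/*
 * The ambiguity being complained about: wherever EPERM is 1 (as on Linux),
 * a bare -1 cannot be told apart from -EPERM.
 */
static bool minus_one_aliases_eperm(void)
{
	return -1 == -EPERM;
}
```

With the outcomes split like this, a caller could switch on the result instead
of testing "ret == -EINVAL" and then treating every other non-zero value the
same way.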

> It's honestly really hard to detangle the return values of different KVM
> functions and what they mean. "return 1" here is not very meaningful,
> and the return code from kvm_skip_emulated_instruction() is not
> documented, so I don't really know what we're supposed to return here in
> what cases. The error codes are usually not interpreted until a few
> layers higher up the callstack.

LOL, welcome to KVM x86. This has been a complaint since before I started working
on KVM. We're finally getting traction on that mess, but it's a _huge_ mess to
sort out.

https://lore.kernel.org/all/20251205074537.17072-1-jgross@xxxxxxxx