Re: [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup

From: Sean Christopherson
Date: Fri Oct 09 2020 - 12:11:38 EST


On Fri, Oct 09, 2020 at 06:48:21PM +0300, stsp wrote:
> 09.10.2020 18:30, Sean Christopherson wrote:
> >On Fri, Oct 09, 2020 at 05:11:51PM +0300, stsp wrote:
> >>09.10.2020 07:04, Sean Christopherson wrote:
> >>>>Hmm. But at least it was lying
> >>>>similarly on AMD and Intel CPUs. :)
> >>>>So I was able to reproduce the problems
> >>>>myself.
> >>>>Do you mean, any AMD tests are now useless, and we need to proceed with Intel
> >>>>tests only?
> >>>For anything VMXE related, yes.
> >>What would be the expected behaviour on Intel, if it is set? Any difference
> >>with AMD?
> >On Intel, userspace should be able to stuff CR4.VMXE=1 via KVM_SET_SREGS if
> >the 'nested' module param is 1, e.g. if 'modprobe kvm_intel nested=1'. Note,
> >'nested' is enabled by default on kernel 5.0 and later.
>
> So if I understand you correctly, we
> need to test that:
> - with nested=0 VMXE gives EINVAL
> - with nested=1 VMXE changes nothing
> visible, except probably to allow guest
> to read that value (we won't test guest
> reading though).
>
> Is this correct?

Yep, exactly!
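
For reference, a minimal sketch of such a check, assuming a Linux x86 host
with /dev/kvm and the uapi headers available. The function name and the
CR4_VMXE macro are made up for illustration and are not taken from the
patches; only the ioctls and struct kvm_sregs come from the KVM API.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/kvm.h>

#define CR4_VMXE (1ULL << 13)	/* CR4 bit 13: VMX enable (illustrative name) */

/*
 * Try to stuff CR4.VMXE=1 into a fresh vCPU via KVM_SET_SREGS.
 * Returns 0 if KVM accepted the value, a positive errno if the ioctl was
 * rejected (e.g. EINVAL), or -1 if the VM/vCPU could not be set up at all.
 */
int probe_cr4_vmxe(void)
{
	struct kvm_sregs sregs;
	int kvm, vm = -1, vcpu = -1, ret = -1;

	kvm = open("/dev/kvm", O_RDWR);
	if (kvm < 0)
		return -1;

	vm = ioctl(kvm, KVM_CREATE_VM, 0);
	if (vm < 0)
		goto out;

	vcpu = ioctl(vm, KVM_CREATE_VCPU, 0);
	if (vcpu < 0)
		goto out;

	/* Start from the vCPU's current state so only VMXE changes. */
	if (ioctl(vcpu, KVM_GET_SREGS, &sregs) < 0)
		goto out;

	sregs.cr4 |= CR4_VMXE;
	ret = ioctl(vcpu, KVM_SET_SREGS, &sregs) < 0 ? errno : 0;
out:
	if (vcpu >= 0)
		close(vcpu);
	if (vm >= 0)
		close(vm);
	close(kvm);
	return ret;
}
```

Per the above, the expectation would be EINVAL with nested=0 (and always on
AMD), and 0 on Intel with nested=1. A return of -1 only means the probe
itself could not run, e.g. no access to /dev/kvm.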

> >With AMD, setting CR4.VMXE=1 is never allowed as AMD doesn't support VMX,
>
> OK, for that I can give you a
> Tested-by: Stas Sergeev <stsp@xxxxxxxxxxxxxxxxxxxxx>
>
> because I confirm that on AMD it now consistently returns EINVAL, whereas
> without your patches it did random crap, depending on whether or not it was
> the first call to KVM_SET_SREGS.
>
>
> >>But we do not use unrestricted guest.
> >>We use v86 under KVM.
> >Unrestricted guest can kick in even if CR0.PG=1 && CR0.PE=1, e.g. there are
> >segmentation checks that apply if and only if unrestricted_guest=0. Long story
> >short, without a deep audit, it's basically impossible to rule out a dependency
> >on unrestricted guest since you're playing around with v86.
>
> You mean "unrestricted_guest" as a module parameter, rather than the similarly
> named CPU feature, right? So we may depend on unrestricted_guest parameter,
> but not on a hardware feature, correct?

The unrestricted_guest module param is tied directly to the hardware feature,
i.e. if kvm_intel.unrestricted_guest=0 then KVM will run guests with
unrestricted guest disabled. That doesn't necessarily mean any of the
behavior that is allowed by unrestricted guest will be encountered, but if
it is encountered, then it will be handled by the CPU instead of causing a
VM-Exit and requiring KVM emulation.

The reporter is using an old CPU that doesn't support unrestricted guest,
so both the hardware feature and the module param will be off/0.
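
To see how a given host is configured, something along these lines can read
the module params back from sysfs. This is only a sketch: the helper names
are invented for illustration, and it assumes the params are exposed under
/sys/module/<module>/parameters/ and read back as Y/N (or 1/0).

```c
#include <stdio.h>

/* Interpret a module param string: 1 = on, 0 = off, -1 = unrecognized. */
int parse_bool_param(const char *s)
{
	if (!s)
		return -1;

	switch (s[0]) {
	case 'Y': case 'y': case '1':
		return 1;
	case 'N': case 'n': case '0':
		return 0;
	default:
		return -1;
	}
}

/*
 * Read /sys/module/<module>/parameters/<param>.
 * Returns 1/0 for on/off, or -1 if the file is missing or unparseable,
 * e.g. because the module is not loaded.
 */
int read_bool_param(const char *module, const char *param)
{
	char path[256], buf[16];
	FILE *f;
	int n;

	n = snprintf(path, sizeof(path), "/sys/module/%s/parameters/%s",
		     module, param);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (!fgets(buf, sizeof(buf), f))
		buf[0] = '\0';
	fclose(f);

	return parse_bool_param(buf);
}
```

E.g. read_bool_param("kvm_intel", "unrestricted_guest") and
read_bool_param("kvm_intel", "nested") cover the two params discussed in
this thread.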

> >>The only other effect of setting VMXE was clearing VME. Which shouldn't
> >>affect anything either, right?
> >Hmm, clearing VME would mean that exceptions/interrupts within the guest would
> >trigger a switch out of v86 and into vanilla protected mode. v86 and PM have
> >different consistency checks, particularly for segmentation, so it's plausible
> >that clearing CR4.VME inadvertently worked around the bug by avoiding invalid
> >guest state for v86.
>
> Let's assume that was the case. With those github guys it's not possible to do
> any consistent checks. :(

K. If this is ever a problem in the future, having a relatively simple
reproducer, e.g. something we can run without having to build/install a
variety of tools, would make it easier to debug. In theory, the bug should be
reproducible even on modern hardware by loading KVM with unrestricted_guest=0.