Re: [PATCH 9/9] KVM: x86: Disable KVM_INTEL_PROVE_VE by default

From: Sean Christopherson
Date: Tue May 21 2024 - 20:29:37 EST


On Tue, May 21, 2024, Paolo Bonzini wrote:
> On Tue, May 21, 2024 at 8:18 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> > > - This should never be enabled in a production environment.
> > > + Note that #VE trapping appears to be buggy on some CPUs.
> >
> > I see where you're coming from, but I don't think "trapping" is much better,
> > e.g. it suggests there's something broken with the interception of #VEs Ah,
> > the entire help text is weird.
>
> Yeah, I didn't want to say #VE is broken altogether -

Ah, yeah, good call. The #VE isn't broken per se, just spurious/unexpected.

> interception is where we saw issues,

It's not an issue with interception, disabling #VE intercepts results in the #VE
being delivered to the guest.

Test suite: ept_access_test_not_present
PTE[4] @ 109fff8 = 9fed0007
PTE[3] @ 9fed0ff0 = 9fed1007
PTE[2] @ 9fed1000 = 9fed2007
VA PTE @ 9fed2000 = 8000000007
Created EPT @ 9feca008 = 11d2007
Created EPT @ 11d2000 = 11d3007
Created EPT @ 11d3000 = 11d4007
L1 hva = 40000000, hpa = 40000000, L2 gva = ffffffff80000000, gpa = 8000000000
Unhandled exception 8 #DF at ip 0000000000410d39
error_code=0000 rflags=00010097 cs=00000008
rax=ffffffff80000000 rcx=0000000000000000 rdx=0000000000000000 rbx=0000000000000000
rbp=000000009fec6fe0 rsi=0000000000000000 rdi=0000000000000000
r8=0000000000000000 r9=0000000000000000 r10=0000000000000000 r11=0000000000000000
r12=ffffffff80000008 r13=0000000000000000 r14=0000000000000000 r15=0000000000000000
cr0=0000000080010031 cr2=0000000000000000 cr3=000000000109f000 cr4=0000000000002020
cr8=0000000000000000
STACK: @410d39 40144a 4002dd

> and #VE is used in production as far as I know (not just by TDX; at least Xen
> and maybe Hyper-V use it for anti-malware purposes?).

Hmm, maybe a spurious #VE is benign? Or it really is limited to A/D bits being
disabled? Not that us speculating is going to change anything :-)

> Maybe "Note: there appear to be bugs in some CPUs that will trigger
> the WARN, in particular with eptad=0 and/or nested virtualization"
> covers all bases.

Works for me. Maybe tweak it slightly to explain why the WARN is triggered?

Note, some CPUs appear to generate spurious EPT Violations #VEs that trigger
KVM's WARN, in particular with eptad=0 and/or nested virtualization.