On 02/01/2018 02:25 PM, David Woodhouse wrote:
On Wed, 2018-01-31 at 23:26 -0500, Konrad Rzeszutek Wilk wrote:
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 6a9f4ec..bfc80ff 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -594,6 +594,14 @@ struct vcpu_vmx {
ÂÂ #endif
ÂÂÂÂÂÂÂÂu64ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ arch_capabilities;
+ÂÂÂÂÂu64ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ spec_ctrl;
+
+ÂÂÂÂÂ/*
+ÂÂÂÂÂ * This indicates that:
+ÂÂÂÂÂ * 1) guest_cpuid_has(X86_FEATURE_IBRS) == true &&
+ÂÂÂÂÂ * 2) The guest has actually initiated a write against the MSR.
+ÂÂÂÂÂ */
+ÂÂÂÂÂbool spec_ctrl_used;
ÂÂÂÂÂÂÂÂ/*
ÂÂÂÂÂÂÂÂ * This indicates that:
Thanks for persisting with the details here, Karim. In addition to
Konrad's heckling at the comments, I'll add my own request to his...
I'd like the comment for spec_ctrl_used to explain why it isn't
entirely redundant with the spec_ctrl_intercepted() function.
Without nesting, I believe it *would* be redundant, but the difference
comes when an L2 is running for which L1 has not permitted the MSR to
be passed through. That's when we have spec_ctrl_used = true but the
MSR *isn't* actually passed through in the active msr_bitmap.
Question: if spec_ctrl_used is always equivalent to the intercept bit
in the vmcs01.msr_bitmap, just not the guest bitmap... should we ditch
it and always use the bit from the vmcs01.msr_bitmap?
If I used the vmcs01.msr_bitmap, spec_ctrl_used will always be true if
L0 passed it to L1. Even if L1 did not actually pass it to L2 and even
if L2 has not written to it yet (!used).
This pretty much renders the short-circuit at
nested_vmx_merge_msr_bitmap useless:
ÂÂÂÂÂÂÂ if (!nested_cpu_has_virt_x2apic_mode(vmcs12) &&
ÂÂÂÂÂÂÂÂÂÂÂ !to_vmx(vcpu)->pred_cmd_used &&
ÂÂÂÂÂÂÂÂÂÂÂ !to_vmx(vcpu)->spec_ctrl_used)
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ return false;
... and the default path will be kvm_vcpu_gpa_to_page + kmap.
That being said, I have to admit the logic for spec_ctrl_used is not
perfect either.
If L1 or any of the L2s touched the MSR, spec_ctrl_used will be set to
true. So if one L2 used the MSR, all other L2s will also skip the short-
circuit mentioned above and end up *always* going through
kvm_vcpu_gpa_to_page + kmap.
Maybe all of this is over-thinking and in reality the short-circuit
above is really useless and all L2 guests are happily using x2apic :)
Amazon Development Center Germany GmbH
Sorry :)