On Thu, May 11, 2023, Yang Weijiang wrote:
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c872a5aafa50..0ccaa467d7d3 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2093,6 +2093,12 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
else
msr_info->data = vmx->pt_desc.guest.addr_a[index / 2];
break;
+ case MSR_IA32_U_CET:
+ case MSR_IA32_PL3_SSP:
+ if (!kvm_cet_is_msr_accessible(vcpu, msr_info))
+ return 1;
+ kvm_get_xsave_msr(msr_info);
+ break;
Please put as much MSR handling in x86.c as possible. We quite obviously know
that AMD support is coming along, there's no reason to duplicate all of this code.
And unless I'm missing something, John's series misses several #GP checks, e.g.
for MSR_IA32_S_CET reserved bits, which means that providing a common implementation
would actually fix bugs.
Got it, will refer to the PAT handling.
For MSRs that require vendor input and/or handling, please follow what was
recently done for MSR_IA32_CR_PAT, where the common bits are handled in common
code, and vendor code does its updates.
The divergent alignment between AMD and Intel could get annoying, but I'm sure
we can figure out a solution.
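A rough userspace sketch of the split being asked for, where common code owns the architectural checks and vendor code only layers its own update on top. All names here (common_set_cet_msr(), vendor_set_cet_msr(), the two state variables) are hypothetical stand-ins, not existing KVM functions, and the reserved-bit mask is simply the GENMASK(9, 6) value from the patch.

```c
#include <stdint.h>

/* Hypothetical: reserved bits 9:6, i.e. the GENMASK(9, 6) check in the patch. */
#define CET_RESERVED_BITS 0x3c0ULL

static uint64_t common_value;   /* stands in for state tracked by common code */
static uint64_t vendor_shadow;  /* stands in for vendor state, e.g. a VMCS field */

/* Common code: owns the architectural checks. Returns 1 to signal #GP. */
static int common_set_cet_msr(uint64_t data)
{
	if (data & CET_RESERVED_BITS)
		return 1;

	common_value = data;
	return 0;
}

/* Vendor code: defers to common code first, then does its own update. */
static int vendor_set_cet_msr(uint64_t data)
{
	int ret = common_set_cet_msr(data);

	if (ret)
		return ret;

	vendor_shadow = data;
	return 0;
}
```

With this shape, a second vendor implementation would get the reserved-bit #GP for free instead of reimplementing it.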
case MSR_IA32_DEBUGCTLMSR:
msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL);
break;
@@ -2405,6 +2411,18 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
else
vmx->pt_desc.guest.addr_a[index / 2] = data;
break;
+ case MSR_IA32_U_CET:
+ case MSR_IA32_PL3_SSP:
+ if (!kvm_cet_is_msr_accessible(vcpu, msr_info))
+ return 1;
+ if (is_noncanonical_address(data, vcpu))
+ return 1;
+ if (msr_index == MSR_IA32_U_CET && (data & GENMASK(9, 6)))
+ return 1;
+ if (msr_index == MSR_IA32_PL3_SSP && (data & GENMASK(2, 0)))
Please #define reserved bits, ideally using the inverse of the valid masks. And
for SSP, it might be better to do IS_ALIGNED(data, 8) (or 4, pending my question
about the SDM's wording).
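To illustrate the suggestion, here is a minimal userspace sketch that derives the reserved bits from the valid masks and uses an alignment check for SSP. The macro and function names are hypothetical; the only layout fact assumed is the one the patch already encodes, namely that bits 9:6 of U_CET are reserved. IS_ALIGNED() is redefined locally as a stand-in for the kernel macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the split mirrors the GENMASK(9, 6) check in the patch. */
#define CET_US_VALID_LOW_BITS   0x3fULL         /* bits 5:0 */
#define CET_US_VALID_HIGH_BITS  (~0ULL << 10)   /* bits 63:10 */
#define CET_US_VALID_BITS       (CET_US_VALID_LOW_BITS | CET_US_VALID_HIGH_BITS)
#define CET_US_RESERVED_BITS    (~CET_US_VALID_BITS)

/* Compile-time sanity check: the inverse of the valid mask is bits 9:6. */
static_assert(CET_US_RESERVED_BITS == 0x3c0ULL, "reserved bits must be 9:6");

/* Userspace stand-in for the kernel's IS_ALIGNED(), power-of-two sizes only. */
#define IS_ALIGNED(x, a)  (((x) & ((uint64_t)(a) - 1)) == 0)

/* True if no reserved bit is set in a would-be U_CET value. */
static bool cet_us_value_is_valid(uint64_t data)
{
	return !(data & CET_US_RESERVED_BITS);
}

/* True if a would-be SSP value is 8-byte aligned. */
static bool cet_ssp_value_is_valid(uint64_t data)
{
	return IS_ALIGNED(data, 8);
}
```

If the SDM turns out to require only 4-byte alignment, the SSP check changes in exactly one place.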
Side topic, what on earth does the SDM mean by this?!?
The linear address written must be aligned to 8 bytes and bits 2:0 must be 0
(hardware requires bits 1:0 to be 0).
I know Intel retroactively changed the alignment requirements, but the above
is nonsensical. If ucode prevents writing bits 2:0, who cares what hardware
requires?
+ return 1;
+ kvm_set_xsave_msr(msr_info);
+ break;
case MSR_IA32_PERF_CAPABILITIES:
if (data && !vcpu_to_pmu(vcpu)->version)
return 1;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b6eec9143129..2e3a39c9297c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -13630,6 +13630,26 @@ int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size,
}
EXPORT_SYMBOL_GPL(kvm_sev_es_string_io);
+bool kvm_cet_is_msr_accessible(struct kvm_vcpu *vcpu, struct msr_data *msr)
+{
+ if (!kvm_cet_user_supported())
This feels wrong. KVM should differentiate between SHSTK and IBT in the host.
E.g. if running in a VM with SHSTK but not IBT, or vice versa, KVM should not allow
writes to non-existent MSRs.
I.e. this looks wrong:
/*
* If SHSTK and IBT are available in KVM, clear CET user bit in
* kvm_caps.supported_xss so that kvm_cet_user_supported() returns
* false when called.
*/
if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
!kvm_cpu_cap_has(X86_FEATURE_IBT))
kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_USER;
and by extension, all dependent code is also wrong. IIRC, there's a virtualization
hole, but I don't see any reason why KVM has to make the hole even bigger.
+ return false;
+
+ if (msr->host_initiated)
+ return true;
+
+ if (!guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) &&
+ !guest_cpuid_has(vcpu, X86_FEATURE_IBT))
+ return false;
+
+ if (msr->index == MSR_IA32_PL3_SSP &&
+ !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK))
I probably asked this long ago, but if I did I since forgot. Is it really just
PL3_SSP that depends on SHSTK? I would expect all shadow stack MSRs to depend
on SHSTK.
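A sketch of what I'd expect the guest-side check to look like if every shadow stack MSR is gated on SHSTK. This is a userspace illustration under assumptions: the helper names are hypothetical, the two bool parameters stand in for guest_cpuid_has(), the host support check is omitted, and the MSR index values are the ones I believe msr-index.h uses for the CET MSRs.

```c
#include <stdbool.h>
#include <stdint.h>

/* CET MSR indices (assumed to match the kernel's msr-index.h definitions). */
#define MSR_IA32_U_CET        0x6a0
#define MSR_IA32_S_CET        0x6a2
#define MSR_IA32_PL0_SSP      0x6a4
#define MSR_IA32_PL3_SSP      0x6a7
#define MSR_IA32_INT_SSP_TAB  0x6a8

/* Hypothetical: PLx_SSP and the interrupt SSP table only exist with SHSTK. */
static bool cet_msr_needs_shstk(uint32_t index)
{
	return index >= MSR_IA32_PL0_SSP && index <= MSR_IA32_INT_SSP_TAB;
}

/*
 * Hypothetical guest-visible accessibility check. guest_shstk and
 * guest_ibt stand in for guest_cpuid_has() on the respective features.
 */
static bool cet_msr_accessible(bool guest_shstk, bool guest_ibt,
			       uint32_t index, bool host_initiated)
{
	if (host_initiated)
		return true;

	if (cet_msr_needs_shstk(index))
		return guest_shstk;

	/* U_CET and S_CET carry both SHSTK and IBT controls. */
	return guest_shstk || guest_ibt;
}
```

An IBT-only guest can then still reach U_CET, but takes a #GP on any of the SSP MSRs, not just PL3_SSP.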
@@ -546,5 +557,25 @@ int kvm_sev_es_mmio_read(struct kvm_vcpu *vcpu, gpa_t src, unsigned int bytes,
int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size,
unsigned int port, void *data, unsigned int count,
int in);
+bool kvm_cet_is_msr_accessible(struct kvm_vcpu *vcpu, struct msr_data *msr);
+
+/*
+ * We've already loaded guest MSRs in __msr_io() after check the MSR index.
Please avoid pronouns
+ * In case vcpu has been preempted, we need to disable preemption, check
vCPU. And this doesn't make any sense. The "vCPU" being preempted doesn't matter,
it's KVM, i.e. the task that's accessing vCPU state that cares about preemption.
I *think* what you're trying to say is that preemption needs to be disabled to
ensure that the guest values are resident.
+ * and reload the guest fpu states before read/write xsaves-managed MSRs.
+ */
+static inline void kvm_get_xsave_msr(struct msr_data *msr_info)
+{
+ fpregs_lock_and_load();
KVM already has helpers that do exactly this, and they have far better names for
KVM: kvm_fpu_get() and kvm_fpu_put(). Can you convert kvm_fpu_get() to
fpregs_lock_and_load() and use those instead? And if the extra consistency checks
in fpregs_lock_and_load() fire, we definitely want to know, as it means we probably
have bugs in KVM.
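Roughly the shape I have in mind, as a userspace sketch only. Everything below is a stub: fpregs_lock_and_load() and fpregs_unlock() are replaced by a depth counter, the MSR read is replaced by a plain variable, and read_xsave_managed_msr() is a hypothetical name. It assumes kvm_fpu_get() has already been converted to call fpregs_lock_and_load().

```c
#include <stdint.h>

/* Stubs: the real helpers live in the kernel and do far more than this. */
static int fpregs_lock_depth;
static uint64_t fake_msr_pl3_ssp = 0x7000;  /* stands in for the hardware MSR */

static void fpregs_lock_and_load(void) { fpregs_lock_depth++; }
static void fpregs_unlock(void)        { fpregs_lock_depth--; }

/* Assumed post-conversion shape of the KVM helpers. */
static void kvm_fpu_get(void) { fpregs_lock_and_load(); }
static void kvm_fpu_put(void) { fpregs_unlock(); }

/* Hypothetical accessor: guest FPU state is resident only inside the bracket. */
static uint64_t read_xsave_managed_msr(void)
{
	uint64_t data;

	kvm_fpu_get();
	data = fake_msr_pl3_ssp;  /* stands in for rdmsrl() */
	kvm_fpu_put();

	return data;
}
```

The point is that the get/put pairing is visible at the call site, which the kvm_get_xsave_msr() naming hides.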