On Fri, Aug 04, 2023, Chao Gao wrote:

I see, will change it, thank you!
On Thu, Aug 03, 2023 at 12:27:24AM -0400, Yang Weijiang wrote:
Add emulation interface for CET MSR read and write.

The emulation code is split into a common part and a vendor-specific
part: the former resides in x86.c to benefit different x86 CPU
vendors; the latter, for VMX, is implemented in this patch.
Signed-off-by: Yang Weijiang <weijiang.yang@xxxxxxxxx>
---
arch/x86/kvm/vmx/vmx.c | 27 +++++++++++
arch/x86/kvm/x86.c | 104 +++++++++++++++++++++++++++++++++++++----
arch/x86/kvm/x86.h | 18 +++++++
3 files changed, 141 insertions(+), 8 deletions(-)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6aa76124e81e..ccf750e79608 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2095,6 +2095,18 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
else
msr_info->data = vmx->pt_desc.guest.addr_a[index / 2];
break;
+ case MSR_IA32_S_CET:
+ case MSR_KVM_GUEST_SSP:
+ case MSR_IA32_INT_SSP_TAB:
+ if (kvm_get_msr_common(vcpu, msr_info))
+ return 1;
+ if (msr_info->index == MSR_KVM_GUEST_SSP)
+ msr_info->data = vmcs_readl(GUEST_SSP);
+ else if (msr_info->index == MSR_IA32_S_CET)
+ msr_info->data = vmcs_readl(GUEST_S_CET);
+ else if (msr_info->index == MSR_IA32_INT_SSP_TAB)
+ msr_info->data = vmcs_readl(GUEST_INTR_SSP_TABLE);
This if-else-if suggests that they are forcibly grouped together to just
share the call of kvm_get_msr_common(). For readability, I think it is better
to handle them separately.
e.g.,
case MSR_IA32_S_CET:
if (kvm_get_msr_common(vcpu, msr_info))
return 1;
msr_info->data = vmcs_readl(GUEST_S_CET);
break;
case MSR_KVM_GUEST_SSP:
if (kvm_get_msr_common(vcpu, msr_info))
return 1;
msr_info->data = vmcs_readl(GUEST_SSP);
break;
Actually, we can do even better. We have an existing framework for these types
of prechecks, I just completely forgot about it :-( (my "look at PAT" was a bad
suggestion).
Handle the checks in __kvm_set_msr() and __kvm_get_msr(), i.e. *before* calling
into vendor code. Then vendor code doesn't need to make weird callbacks.
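Roughly this shape, i.e. a precheck case in the common __kvm_set_msr() switch
(just an illustration of the idea; the actual, completely untested checks are
in the diff further below):

	case MSR_IA32_U_CET:
	case MSR_IA32_S_CET:
		/*
		 * Reject the access before vendor code ever sees it, so that
		 * vmx_set_msr() only has to write the VMCS field.
		 */
		if (!guest_can_use(vcpu, X86_FEATURE_SHSTK) &&
		    !guest_can_use(vcpu, X86_FEATURE_IBT))
			return 1;
		break;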
OK.

int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
{
u32 msr = msr_info->index;
@@ -3981,6 +4014,45 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
vcpu->arch.guest_fpu.xfd_err = data;
break;
#endif
+#define CET_EXCLUSIVE_BITS (CET_SUPPRESS | CET_WAIT_ENDBR)
+#define CET_CTRL_RESERVED_BITS GENMASK(9, 6)
Please use a single namespace for these #defines, e.g. CET_CTRL_* or maybe
CET_US_* for everything.
Sure :-)

+#define CET_SHSTK_MASK_BITS GENMASK(1, 0)
+#define CET_IBT_MASK_BITS (GENMASK_ULL(5, 2) | \
+ GENMASK_ULL(63, 10))
+#define CET_LEG_BITMAP_BASE(data) ((data) >> 12)
Bah, stupid SDM. Please spell out "LEGACY", I thought "LEG" was short for "LEGAL"
since this looks a lot like a page shift, i.e. getting a pfn.
I saw Paolo shares a different opinion on this, so I would hold on for a while...

+static bool kvm_cet_is_msr_accessible(struct kvm_vcpu *vcpu,
+ struct msr_data *msr)
+{
+ if (is_shadow_stack_msr(msr->index)) {
+ if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK))
+ return false;
+
+ if (msr->index == MSR_KVM_GUEST_SSP)
+ return msr->host_initiated;
+
+ return msr->host_initiated ||
+ guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
+ }
+
+ if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
+ !kvm_cpu_cap_has(X86_FEATURE_IBT))
+ return false;
+
+ return msr->host_initiated ||
+ guest_cpuid_has(vcpu, X86_FEATURE_IBT) ||
+ guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
Similar to my suggestion for XSS, I think we drop the waiver for host_initiated
accesses, i.e. require the feature to be enabled and exposed to the guest, even
for the host.
OK, will do it, thanks!

diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index c69fc027f5ec..3b79d6db2f83 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -552,4 +552,22 @@ int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size,
unsigned int port, void *data, unsigned int count,
int in);
+/*
+ * Guest xstate MSRs have been loaded in __msr_io(), disable preemption before
+ * access the MSRs to avoid MSR content corruption.
+ */
I think it is better to describe what the function does prior to jumping into
details like where guest FPU is loaded, e.g.:
/*
* Lock and/or reload guest FPU and access xstate MSRs. For accesses initiated
* by host, guest FPU is loaded in __msr_io(). For accesses initiated by guest,
* guest FPU should have been loaded already.
*/
+static inline void kvm_get_xsave_msr(struct msr_data *msr_info)
+{
+ kvm_fpu_get();
+ rdmsrl(msr_info->index, msr_info->data);
+ kvm_fpu_put();
+}
+
+static inline void kvm_set_xsave_msr(struct msr_data *msr_info)
+{
+ kvm_fpu_get();
+ wrmsrl(msr_info->index, msr_info->data);
+ kvm_fpu_put();
+}
Can you rename the functions to kvm_get/set_xstate_msr() to align with the comment
and patch 6? And if there is no user outside x86.c, you can just put these two
functions right after the is_xstate_msr() added in patch 6.

OK, maybe I added the helpers in this patch due to the compilation error
"function is defined but not used".
+1. These should also assert that (a) guest FPU state is loaded and (b) the MSR
is passed through to the guest. I might be ok dropping (b) if both VMX and SVM
passthrough all MSRs if they're exposed to the guest, i.e. not lazily passed
through.

I'm OK to add the assert if finally all the CET MSRs are passed through directly.
Do you mean something like this:
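A rough, untested sketch -- assuming the helpers get renamed to
kvm_get/set_xstate_msr() per the comment above, gain a vcpu argument, and that
checking vcpu->arch.guest_fpu.fpstate->in_use is an acceptable way to assert
that guest FPU state is loaded (all of that is my guess, not code from this
series):

static inline void kvm_get_xstate_msr(struct kvm_vcpu *vcpu,
				      struct msr_data *msr_info)
{
	/* Assert (a): guest FPU state must already be loaded at this point. */
	WARN_ON_ONCE(!vcpu->arch.guest_fpu.fpstate->in_use);
	kvm_fpu_get();
	rdmsrl(msr_info->index, msr_info->data);
	kvm_fpu_put();
}

static inline void kvm_set_xstate_msr(struct kvm_vcpu *vcpu,
				      struct msr_data *msr_info)
{
	/* Assert (a): guest FPU state must already be loaded at this point. */
	WARN_ON_ONCE(!vcpu->arch.guest_fpu.fpstate->in_use);
	kvm_fpu_get();
	wrmsrl(msr_info->index, msr_info->data);
	kvm_fpu_put();
}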
Sans any changes to kvm_{g,s}et_xsave_msr(), I think this? (completely untested)

The code looks good to me except the handling of MSR_KVM_GUEST_SSP,
---
arch/x86/kvm/vmx/vmx.c | 34 +++-------
arch/x86/kvm/x86.c | 151 +++++++++++++++--------------------------
2 files changed, 64 insertions(+), 121 deletions(-)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 491039aeb61b..1211eb469d06 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2100,16 +2100,13 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
msr_info->data = vmx->pt_desc.guest.addr_a[index / 2];
break;
case MSR_IA32_S_CET:
+ msr_info->data = vmcs_readl(GUEST_S_CET);
+ break;
case MSR_KVM_GUEST_SSP:
+ msr_info->data = vmcs_readl(GUEST_SSP);
+ break;
case MSR_IA32_INT_SSP_TAB:
- if (kvm_get_msr_common(vcpu, msr_info))
- return 1;
- if (msr_info->index == MSR_KVM_GUEST_SSP)
- msr_info->data = vmcs_readl(GUEST_SSP);
- else if (msr_info->index == MSR_IA32_S_CET)
- msr_info->data = vmcs_readl(GUEST_S_CET);
- else if (msr_info->index == MSR_IA32_INT_SSP_TAB)
- msr_info->data = vmcs_readl(GUEST_INTR_SSP_TABLE);
+ msr_info->data = vmcs_readl(GUEST_INTR_SSP_TABLE);
break;
case MSR_IA32_DEBUGCTLMSR:
msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL);
@@ -2432,25 +2429,14 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
else
vmx->pt_desc.guest.addr_a[index / 2] = data;
break;
- case MSR_IA32_PL0_SSP ... MSR_IA32_PL2_SSP:
- if (kvm_set_msr_common(vcpu, msr_info))
- return 1;
- if (data) {
- vmx_disable_write_intercept_sss_msr(vcpu);
- wrmsrl(msr_index, data);
- }
- break;
case MSR_IA32_S_CET:
+ vmcs_writel(GUEST_S_CET, data);
+ break;
case MSR_KVM_GUEST_SSP:
+ vmcs_writel(GUEST_SSP, data);
+ break;
case MSR_IA32_INT_SSP_TAB:
- if (kvm_set_msr_common(vcpu, msr_info))
- return 1;
- if (msr_index == MSR_KVM_GUEST_SSP)
- vmcs_writel(GUEST_SSP, data);
- else if (msr_index == MSR_IA32_S_CET)
- vmcs_writel(GUEST_S_CET, data);
- else if (msr_index == MSR_IA32_INT_SSP_TAB)
- vmcs_writel(GUEST_INTR_SSP_TABLE, data);
+ vmcs_writel(GUEST_INTR_SSP_TABLE, data);
break;
case MSR_IA32_PERF_CAPABILITIES:
if (data && !vcpu_to_pmu(vcpu)->version)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7385fc25a987..75e6de7c9268 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1838,6 +1838,11 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type)
}
EXPORT_SYMBOL_GPL(kvm_msr_allowed);
+#define CET_US_RESERVED_BITS GENMASK(9, 6)
+#define CET_US_SHSTK_MASK_BITS GENMASK(1, 0)
+#define CET_US_IBT_MASK_BITS (GENMASK_ULL(5, 2) | GENMASK_ULL(63, 10))
+#define CET_US_LEGACY_BITMAP_BASE(data) ((data) >> 12)
+
/*
* Write @data into the MSR specified by @index. Select MSR specific fault
* checks are bypassed if @host_initiated is %true.
@@ -1897,6 +1902,35 @@ static int __kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data,
data = (u32)data;
break;
+ case MSR_IA32_U_CET:
+ case MSR_IA32_S_CET:
+ if (!guest_can_use(vcpu, X86_FEATURE_SHSTK) &&
+ !guest_can_use(vcpu, X86_FEATURE_IBT))
+ return 1;
+ if (data & CET_US_RESERVED_BITS)
+ return 1;
+ if (!guest_can_use(vcpu, X86_FEATURE_SHSTK) &&
+ (data & CET_US_SHSTK_MASK_BITS))
+ return 1;
+ if (!guest_can_use(vcpu, X86_FEATURE_IBT) &&
+ (data & CET_US_IBT_MASK_BITS))
+ return 1;
+ if (!IS_ALIGNED(CET_US_LEGACY_BITMAP_BASE(data), 4))
+ return 1;
+
+ /* IBT can be suppressed iff the TRACKER isn't WAIT_ENDBR. */
+ if ((data & CET_SUPPRESS) && (data & CET_WAIT_ENDBR))
+ return 1;
+ break;
+ case MSR_IA32_PL0_SSP ... MSR_IA32_INT_SSP_TAB:
+ case MSR_KVM_GUEST_SSP:
+ if (!guest_can_use(vcpu, X86_FEATURE_SHSTK))
+ return 1;
+ if (is_noncanonical_address(data, vcpu))
+ return 1;
+ if (!IS_ALIGNED(data, 4))
+ return 1;
+ break;
}
msr.data = data;
@@ -1940,6 +1974,17 @@ static int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data,
!guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
return 1;
break;
+ case MSR_IA32_U_CET:
+ case MSR_IA32_S_CET:
+ if (!guest_can_use(vcpu, X86_FEATURE_IBT) &&
+ !guest_can_use(vcpu, X86_FEATURE_SHSTK))
+ return 1;
+ break;
+ case MSR_IA32_PL0_SSP ... MSR_IA32_INT_SSP_TAB:
+ case MSR_KVM_GUEST_SSP:
+ if (!guest_can_use(vcpu, X86_FEATURE_SHSTK))
+ return 1;
+ break;
}
msr.index = index;
@@ -3640,47 +3685,6 @@ static bool kvm_is_msr_to_save(u32 msr_index)
return false;
}
-static inline bool is_shadow_stack_msr(u32 msr)
-{
- return msr == MSR_IA32_PL0_SSP ||
- msr == MSR_IA32_PL1_SSP ||
- msr == MSR_IA32_PL2_SSP ||
- msr == MSR_IA32_PL3_SSP ||
- msr == MSR_IA32_INT_SSP_TAB ||
- msr == MSR_KVM_GUEST_SSP;
-}
-
-static bool kvm_cet_is_msr_accessible(struct kvm_vcpu *vcpu,
- struct msr_data *msr)
-{
- if (is_shadow_stack_msr(msr->index)) {
- if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK))
- return false;
-
- /*
- * This MSR is synthesized mainly for userspace access during
- * Live Migration, it also can be accessed in SMM mode by VMM.
- * Guest is not allowed to access this MSR.
- */
- if (msr->index == MSR_KVM_GUEST_SSP) {
- if (IS_ENABLED(CONFIG_X86_64) && is_smm(vcpu))
- return true;
-
- return msr->host_initiated;
- }
-
- return msr->host_initiated ||
- guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
- }
-
- if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
- !kvm_cpu_cap_has(X86_FEATURE_IBT))
- return false;
-
- return msr->host_initiated ||
- guest_cpuid_has(vcpu, X86_FEATURE_IBT) ||
- guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
-}
int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
{
@@ -4036,46 +4040,9 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
vcpu->arch.guest_fpu.xfd_err = data;
break;
#endif
-#define CET_EXCLUSIVE_BITS (CET_SUPPRESS | CET_WAIT_ENDBR)
-#define CET_CTRL_RESERVED_BITS GENMASK(9, 6)
-#define CET_SHSTK_MASK_BITS GENMASK(1, 0)
-#define CET_IBT_MASK_BITS (GENMASK_ULL(5, 2) | \
- GENMASK_ULL(63, 10))
-#define CET_LEG_BITMAP_BASE(data) ((data) >> 12)
case MSR_IA32_U_CET:
- case MSR_IA32_S_CET:
- if (!kvm_cet_is_msr_accessible(vcpu, msr_info))
- return 1;
- if (!!(data & CET_CTRL_RESERVED_BITS))
- return 1;
- if (!guest_can_use(vcpu, X86_FEATURE_SHSTK) &&
- (data & CET_SHSTK_MASK_BITS))
- return 1;
- if (!guest_can_use(vcpu, X86_FEATURE_IBT) &&
- (data & CET_IBT_MASK_BITS))
- return 1;
- if (!IS_ALIGNED(CET_LEG_BITMAP_BASE(data), 4) ||
- (data & CET_EXCLUSIVE_BITS) == CET_EXCLUSIVE_BITS)
- return 1;
- if (msr == MSR_IA32_U_CET)
- kvm_set_xsave_msr(msr_info);
- break;
- case MSR_KVM_GUEST_SSP:
- case MSR_IA32_PL0_SSP ... MSR_IA32_INT_SSP_TAB:
- if (!kvm_cet_is_msr_accessible(vcpu, msr_info))
- return 1;
- if (is_noncanonical_address(data, vcpu))
- return 1;
- if (!IS_ALIGNED(data, 4))
- return 1;
- if (msr == MSR_IA32_PL0_SSP || msr == MSR_IA32_PL1_SSP ||
- msr == MSR_IA32_PL2_SSP) {
- vcpu->arch.cet_s_ssp[msr - MSR_IA32_PL0_SSP] = data;
- if (!vcpu->arch.cet_sss_active && data)
- vcpu->arch.cet_sss_active = true;
- } else if (msr == MSR_IA32_PL3_SSP) {
- kvm_set_xsave_msr(msr_info);
- }
+ case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
+ kvm_set_xsave_msr(msr_info);
break;
default:
if (kvm_pmu_is_valid_msr(vcpu, msr))
@@ -4436,17 +4403,8 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
break;
#endif
case MSR_IA32_U_CET:
- case MSR_IA32_S_CET:
- case MSR_KVM_GUEST_SSP:
- case MSR_IA32_PL0_SSP ... MSR_IA32_INT_SSP_TAB:
- if (!kvm_cet_is_msr_accessible(vcpu, msr_info))
- return 1;
- if (msr == MSR_IA32_PL0_SSP || msr == MSR_IA32_PL1_SSP ||
- msr == MSR_IA32_PL2_SSP) {
- msr_info->data = vcpu->arch.cet_s_ssp[msr - MSR_IA32_PL0_SSP];
- } else if (msr == MSR_IA32_U_CET || msr == MSR_IA32_PL3_SSP) {
- kvm_get_xsave_msr(msr_info);
- }
+ case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
+ kvm_get_xsave_msr(msr_info);
break;
default:
if (kvm_pmu_is_valid_msr(vcpu, msr))
@@ -7330,9 +7288,13 @@ static void kvm_probe_msr_to_save(u32 msr_index)
break;
case MSR_IA32_U_CET:
case MSR_IA32_S_CET:
+ if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
+ !kvm_cpu_cap_has(X86_FEATURE_IBT))
+ return;
+ break;
case MSR_KVM_GUEST_SSP:
case MSR_IA32_PL0_SSP ... MSR_IA32_INT_SSP_TAB:
- if (!kvm_is_cet_supported())
+ if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK))
return;
break;
default:
@@ -9664,13 +9626,8 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
kvm_caps.supported_xcr0 = host_xcr0 & KVM_SUPPORTED_XCR0;
}
if (boot_cpu_has(X86_FEATURE_XSAVES)) {
- u32 eax, ebx, ecx, edx;
-
- cpuid_count(0xd, 1, &eax, &ebx, &ecx, &edx);
rdmsrl(MSR_IA32_XSS, host_xss);
kvm_caps.supported_xss = host_xss & KVM_SUPPORTED_XSS;
- if (ecx & XFEATURE_MASK_CET_KERNEL)
- kvm_caps.supported_xss |= XFEATURE_MASK_CET_KERNEL;
}
rdmsrl_safe(MSR_EFER, &host_efer);
base-commit: efb9177acd7a4df5883b844e1ec9c69ef0899c9c