Re: [PATCH] KVM: SVM: Propagate Translation Cache Extensions to the guest

From: Sean Christopherson

Date: Fri Mar 06 2026 - 11:27:08 EST


On Fri, Mar 06, 2026, Yosry Ahmed wrote:
> From: Venkatesh Srinivas <venkateshs@xxxxxxxxxxxx>
>
> TCE augments the behavior of TLB invalidating instructions (INVLPG,
> INVLPGB, and INVPCID) to only invalidate translations for relevant
> intermediate mappings to the address range, rather than ALL intermediate
> translations.
>
> The Linux kernel has been setting EFER.TCE if supported by the CPU since
> commit 440a65b7d25f ("x86/mm: Enable AMD translation cache extensions"),
> as it may improve performance.
>
> KVM does not need to do anything to virtualize the feature,

Please back this up with actual analysis.

> only advertise it and allow setting EFER.TCE. Passthrough X86_FEATURE_TCE to

Advertise X86_FEATURE_TCE to userspace, not "passthrough xxx to the guest".
Because that's all KVM does.

> the guest, and allow the guest to set EFER.TCE if available.
>
> Co-developed-by: Yosry Ahmed <yosry@xxxxxxxxxx>
> Signed-off-by: Yosry Ahmed <yosry@xxxxxxxxxx>
> Signed-off-by: Venkatesh Srinivas <venkateshs@xxxxxxxxxxxx>

Your SoB should come last to capture the chain of handling, i.e. this should
be:

Signed-off-by: Venkatesh Srinivas <venkateshs@xxxxxxxxxxxx>
Co-developed-by: Yosry Ahmed <yosry@xxxxxxxxxx>
Signed-off-by: Yosry Ahmed <yosry@xxxxxxxxxx>

> ---
> arch/x86/kvm/cpuid.c | 1 +
> arch/x86/kvm/svm/svm.c | 3 +++
> arch/x86/kvm/x86.c | 3 +++
> 3 files changed, 7 insertions(+)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index fffbf087937d4..4f810f23b1d9b 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -1112,6 +1112,7 @@ void kvm_initialize_cpu_caps(void)
> F(XOP),
> /* SKINIT, WDT, LWP */
> F(FMA4),
> + F(TCE),
> F(TBM),
> F(TOPOEXT),
> VENDOR_F(PERFCTR_CORE),
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 3407deac90bd6..fee1c8cd45973 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -5580,6 +5580,9 @@ static __init int svm_hardware_setup(void)
> if (boot_cpu_has(X86_FEATURE_AUTOIBRS))
> kvm_enable_efer_bits(EFER_AUTOIBRS);
>
> + if (boot_cpu_has(X86_FEATURE_TCE))
> + kvm_enable_efer_bits(EFER_TCE);

Hrm, I think we should handle all of the kvm_enable_efer_bits() calls that are
conditioned only on CPU support in common code. While it's highly unlikely Intel
CPUs will ever support more EFER-based features, if they do, then KVM will
over-report support since kvm_initialize_cpu_caps() will effectively enable the
feature, but VMX won't enable the corresponding EFER bit.
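
To illustrate the mismatch, here is a minimal userspace sketch, not kernel code.
All toy_* names are hypothetical stand-ins for the KVM caps and the supported
EFER mask; only the EFER bit positions are the architectural ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural EFER bit positions. */
#define EFER_NX		(1ULL << 11)
#define EFER_FFXSR	(1ULL << 14)
#define EFER_TCE	(1ULL << 15)
#define EFER_AUTOIBRS	(1ULL << 21)

/* Hypothetical stand-ins for the X86_FEATURE_* flags KVM advertises. */
enum toy_feature {
	TOY_F_NX	= 1u << 0,
	TOY_F_FXSR_OPT	= 1u << 1,
	TOY_F_TCE	= 1u << 2,
	TOY_F_AUTOIBRS	= 1u << 3,
};

/* Toy model of the two pieces of state that must stay in sync. */
struct toy_kvm {
	uint32_t cpu_caps;	/* features advertised to userspace */
	uint64_t efer_allowed;	/* EFER bits the guest is allowed to set */
};

/* Each EFER-based feature and the EFER bit that it requires. */
static const struct {
	uint32_t feature;
	uint64_t efer_bit;
} toy_efer_map[] = {
	{ TOY_F_NX,		EFER_NX },
	{ TOY_F_FXSR_OPT,	EFER_FFXSR },
	{ TOY_F_TCE,		EFER_TCE },
	{ TOY_F_AUTOIBRS,	EFER_AUTOIBRS },
};

#define TOY_EFER_MAP_SIZE (sizeof(toy_efer_map) / sizeof(toy_efer_map[0]))

/* Common-code approach: derive the allowed EFER bits purely from the caps. */
static void toy_setup_efer_caps(struct toy_kvm *kvm)
{
	assert(kvm != NULL);

	for (size_t i = 0; i < TOY_EFER_MAP_SIZE; i++) {
		if (kvm->cpu_caps & toy_efer_map[i].feature)
			kvm->efer_allowed |= toy_efer_map[i].efer_bit;
	}
}

/* Per-vendor approach where the vendor code only knows about NX. */
static void toy_vendor_setup_nx_only(struct toy_kvm *kvm)
{
	assert(kvm != NULL);

	if (kvm->cpu_caps & TOY_F_NX)
		kvm->efer_allowed |= EFER_NX;
}

/* True if every advertised EFER-based feature has its EFER bit allowed. */
static bool toy_efer_consistent(const struct toy_kvm *kvm)
{
	for (size_t i = 0; i < TOY_EFER_MAP_SIZE; i++) {
		if ((kvm->cpu_caps & toy_efer_map[i].feature) &&
		    !(kvm->efer_allowed & toy_efer_map[i].efer_bit))
			return false;
	}
	return true;
}
```

With cpu_caps = TOY_F_NX | TOY_F_TCE, the NX-only vendor setup leaves the model
inconsistent (TCE advertised, EFER.TCE still reserved), whereas the common setup
keeps the two in sync by construction.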

I can't think of anything that will go sideways if we rely purely on KVM caps, so
maybe do something like this as prep work, and then land TCE in common x86?

---
arch/x86/kvm/svm/svm.c | 7 +------
arch/x86/kvm/vmx/vmx.c | 4 ----
arch/x86/kvm/x86.c | 14 ++++++++++++++
3 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 3407deac90bd..c23ee45f2ba8 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -5556,14 +5556,10 @@ static __init int svm_hardware_setup(void)
pr_err_ratelimited("NX (Execute Disable) not supported\n");
return -EOPNOTSUPP;
}
- kvm_enable_efer_bits(EFER_NX);

kvm_caps.supported_xcr0 &= ~(XFEATURE_MASK_BNDREGS |
XFEATURE_MASK_BNDCSR);

- if (boot_cpu_has(X86_FEATURE_FXSR_OPT))
- kvm_enable_efer_bits(EFER_FFXSR);
-
if (tsc_scaling) {
if (!boot_cpu_has(X86_FEATURE_TSCRATEMSR)) {
tsc_scaling = false;
@@ -5577,8 +5573,7 @@ static __init int svm_hardware_setup(void)

tsc_aux_uret_slot = kvm_add_user_return_msr(MSR_TSC_AUX);

- if (boot_cpu_has(X86_FEATURE_AUTOIBRS))
- kvm_enable_efer_bits(EFER_AUTOIBRS);
+

/* Check for pause filtering support */
if (!boot_cpu_has(X86_FEATURE_PAUSEFILTER)) {
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 9302c16571cd..2b8a7456039c 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8583,10 +8583,6 @@ __init int vmx_hardware_setup(void)

vmx_setup_user_return_msrs();

-
- if (boot_cpu_has(X86_FEATURE_NX))
- kvm_enable_efer_bits(EFER_NX);
-
if (boot_cpu_has(X86_FEATURE_MPX)) {
rdmsrq(MSR_IA32_BNDCFGS, host_bndcfgs);
WARN_ONCE(host_bndcfgs, "BNDCFGS in host will be lost");
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 879cdeb6adde..0b5d48e75b65 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10025,6 +10025,18 @@ void kvm_setup_xss_caps(void)
}
EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_setup_xss_caps);

+static void kvm_setup_efer_caps(void)
+{
+ if (kvm_cpu_cap_has(X86_FEATURE_NX))
+ kvm_enable_efer_bits(EFER_NX);
+
+ if (kvm_cpu_cap_has(X86_FEATURE_FXSR_OPT))
+ kvm_enable_efer_bits(EFER_FFXSR);
+
+ if (kvm_cpu_cap_has(X86_FEATURE_AUTOIBRS))
+ kvm_enable_efer_bits(EFER_AUTOIBRS);
+}
+
static inline void kvm_ops_update(struct kvm_x86_init_ops *ops)
{
memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops));
@@ -10161,6 +10173,8 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
if (r != 0)
goto out_mmu_exit;

+ kvm_setup_efer_caps();
+
enable_device_posted_irqs &= enable_apicv &&
irq_remapping_cap(IRQ_POSTING_CAP);


base-commit: 5128b972fb2801ad9aca54d990a75611ab5283a9
--