Re: [PATCH 2/6] KVM: nSVM: Always recalculate LBR MSR intercepts in svm_update_lbrv()
From: Yosry Ahmed
Date: Tue Nov 11 2025 - 13:55:37 EST
On Tue, Nov 11, 2025 at 03:11:37AM +0000, Yosry Ahmed wrote:
> On Sat, Nov 08, 2025 at 12:45:20AM +0000, Yosry Ahmed wrote:
> > svm_update_lbrv() is called when MSR_IA32_DEBUGCTLMSR is updated, and on
> > nested transitions where LBRV is used. It checks whether LBRV enablement
> > needs to be changed in the current VMCB, and if it does, it also
> > recalculate intercepts to LBR MSRs.
> >
> > However, there are cases where intercepts need to be updated even when
> > LBRV enablement doesn't. Example scenario:
> > - L1 has MSR_IA32_DEBUGCTLMSR cleared.
> > - L1 runs L2 without LBR_CTL_ENABLE (no LBRV).
> > - L2 sets DEBUGCTLMSR_LBR in MSR_IA32_DEBUGCTLMSR, svm_update_lbrv()
> > sets LBR_CTL_ENABLE in VMCB02 and disables intercepts to LBR MSRs.
> > - L2 exits to L1, svm_update_lbrv() is not called on this transition.
> > - L1 clears MSR_IA32_DEBUGCTLMSR, svm_update_lbrv() finds that
> > LBR_CTL_ENABLE is already cleared in VMCB01 and does nothing.
> > - Intercepts remain disabled, L1 reads to LBR MSRs read the host MSRs.
> >
> > Fix it by always recalculating intercepts in svm_update_lbrv().
>
> This actually breaks hyperv_svm_test, because svm_update_lbrv() is
> called on every nested transition, calling
> svm_recalc_lbr_msr_intercepts() -> svm_set_intercept_for_msr() and
> setting svm->nested.force_msr_bitmap_recalc to true.
>
> This breaks the hyperv optimization in nested_svm_vmrun_msrpm() AFAICT.
>
> I think there are two ways to fix this:
> - Add another bool to svm->nested to track LBR intercepts, and only call
> svm_set_intercept_for_msr() if the intercepts need to be updated.
>
> - Update svm_set_intercept_for_msr() itself to do nothing if the
> intercepts do not need to be changed, which is more clutter but
> applies to other callers as well so could shave cycles elsewhere (see
> below).
>
> Sean, Paolo, any preferences?
>
> Here's what updating svm_set_intercept_for_msr() looks like:
and that diff breaks userspace_msr_exit_test :)
Here's an actually tested diff:
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 2fbb0b88c6a3e..88717429ba9d5 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -664,24 +664,38 @@ void svm_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type, bool se
{
struct vcpu_svm *svm = to_svm(vcpu);
void *msrpm = svm->msrpm;
+ bool recalc = false;
+ bool already_set;
/* Don't disable interception for MSRs userspace wants to handle. */
if (type & MSR_TYPE_R) {
- if (!set && kvm_msr_allowed(vcpu, msr, KVM_MSR_FILTER_READ))
+ set = set || !kvm_msr_allowed(vcpu, msr, KVM_MSR_FILTER_READ);
+ already_set = svm_test_msr_bitmap_read(msrpm, msr);
+
+ if (!set && already_set)
svm_clear_msr_bitmap_read(msrpm, msr);
- else
+ else if (set && !already_set)
svm_set_msr_bitmap_read(msrpm, msr);
+
+ recalc |= (set != already_set);
}
if (type & MSR_TYPE_W) {
- if (!set && kvm_msr_allowed(vcpu, msr, KVM_MSR_FILTER_WRITE))
+ set = set || !kvm_msr_allowed(vcpu, msr, KVM_MSR_FILTER_WRITE);
+ already_set = svm_test_msr_bitmap_write(msrpm, msr);
+
+ if (!set && already_set)
svm_clear_msr_bitmap_write(msrpm, msr);
- else
+ else if (set && !already_set)
svm_set_msr_bitmap_write(msrpm, msr);
+
+ recalc |= (set != already_set);
}
- svm_hv_vmcb_dirty_nested_enlightenments(vcpu);
- svm->nested.force_msr_bitmap_recalc = true;
+ if (recalc) {
+ svm_hv_vmcb_dirty_nested_enlightenments(vcpu);
+ svm->nested.force_msr_bitmap_recalc = true;
+ }
}
void *svm_alloc_permissions_map(unsigned long size, gfp_t gfp_mask)
---
For the record, I don't want to just use svm_test_msr_bitmap_*() in
svm_update_lbrv() because there is no direct equivalent in older kernels
as far as I can tell, so a backport would be completely inapplicable.