Re: [PATCH 2/4] perf/x86/intel: Don't context switch DS_AREA (and PEBS config) if PEBS is unused
From: Jim Mattson
Date: Tue Apr 14 2026 - 17:32:22 EST
On Tue, Apr 14, 2026 at 12:14 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> When filling the list of MSRs to be loaded by KVM on VM-Enter and VM-Exit,
> insert DS_AREA and (conditionally) MSR_PEBS_DATA_CFG into the list if and
> only if PEBS will be active in the guest, i.e. only if a PEBS record may be
> generated while running the guest. As shown by the !x86_pmu.pebs_ept path,
> it's perfectly safe to run with the host's DS_AREA, so long as PEBS-enabled
> counters are disabled via PERF_GLOBAL_CTRL.
>
> Omitting DS_AREA and MSR_PEBS_DATA_CFG when PEBS is unused saves one MSR
> write per listed MSR on each VMX transition, i.e. eliminates two/four
> pointless MSR writes on each VMX roundtrip when the guest isn't using PEBS.
>
> Fixes: c59a1f106f5c ("KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS")
> Cc: Jim Mattson <jmattson@xxxxxxxxxx>
> Cc: Mingwei Zhang <mizhang@xxxxxxxxxx>
> Cc: Stephane Eranian <eranian@xxxxxxxxxx>
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> ---
> arch/x86/events/intel/core.c | 41 ++++++++++++++++++++++++------------
> 1 file changed, 27 insertions(+), 14 deletions(-)
>
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 002d809f82ef..20a153aa33cb 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -5037,23 +5037,14 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
> return arr;
> }
>
> + /*
> + * If the guest won't use PEBS or the CPU doesn't support PEBS in the
> + * guest, then there's nothing more to do as disabling PMCs via
> + * PERF_GLOBAL_CTRL is sufficient on CPUs with guest/host isolation.
> + */
> if (!kvm_pmu || !x86_pmu.pebs_ept)
> return arr;
>
> - arr[(*nr)++] = (struct perf_guest_switch_msr){
> - .msr = MSR_IA32_DS_AREA,
> - .host = (unsigned long)cpuc->ds,
> - .guest = kvm_pmu->ds_area,
> - };
> -
> - if (x86_pmu.intel_cap.pebs_baseline) {
> - arr[(*nr)++] = (struct perf_guest_switch_msr){
> - .msr = MSR_PEBS_DATA_CFG,
> - .host = cpuc->active_pebs_data_cfg,
> - .guest = kvm_pmu->pebs_data_cfg,
> - };
> - }
> -
> /*
> * Disable counters where the guest PMC is different than the host PMC
> * being used on behalf of the guest, as the PEBS record includes
> @@ -5065,6 +5056,28 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
> if (pebs_mask & ~cpuc->intel_ctrl_guest_mask)
> guest_pebs_mask = 0;
>
> + /*
> + * Context switch DS_AREA and PEBS_DATA_CFG if and only if PEBS will be
> + * active in the guest; if no records will be generated while the guest
> + * is running, then running with host values is safe (see above).
> + */
> + if (!guest_pebs_mask)
> + return arr;
I think this has an unintended side effect. If DS_AREA and
PEBS_DATA_CFG were previously in the list (because guest_pebs_mask was
previously non-zero), KVM will leave the stale entries in the VM-Enter
and VM-Exit MSR-load lists.
KVM only clears an MSR-load list entry when perf still enumerates the
MSR and its guest and host values match.
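To make the hazard concrete, here is a minimal user-space model of that
clearing logic (the flat autoload array, add_autoload() and
clear_autoload() are simplified stand-ins for KVM's MSR-load lists and
add_atomic_switch_msr()/clear_atomic_switch_msr(); the MSR values are
made up):

```c
#include <assert.h>
#include <stdint.h>

#define MSR_IA32_DS_AREA 0x600
#define MAX_AUTOLOAD 8

struct autoload_entry { uint32_t msr; uint64_t guest, host; };

static struct autoload_entry autoload[MAX_AUTOLOAD];
static int autoload_nr;

/* Stand-in for add_atomic_switch_msr(): add or update an entry. */
static void add_autoload(uint32_t msr, uint64_t guest, uint64_t host)
{
	int i;

	for (i = 0; i < autoload_nr; i++)
		if (autoload[i].msr == msr)
			break;
	if (i == autoload_nr)
		autoload_nr++;
	autoload[i] = (struct autoload_entry){ msr, guest, host };
}

/* Stand-in for clear_atomic_switch_msr(): drop an entry if present. */
static void clear_autoload(uint32_t msr)
{
	for (int i = 0; i < autoload_nr; i++)
		if (autoload[i].msr == msr) {
			autoload[i] = autoload[--autoload_nr];
			return;
		}
}

/*
 * Stand-in for atomic_switch_perf_msrs(): only MSRs enumerated by perf
 * are ever added or cleared; anything perf stops enumerating is simply
 * left behind in the autoload list.
 */
static void switch_perf_msrs(const struct autoload_entry *msrs, int nr)
{
	for (int i = 0; i < nr; i++)
		if (msrs[i].guest == msrs[i].host)
			clear_autoload(msrs[i].msr);
		else
			add_autoload(msrs[i].msr, msrs[i].guest,
				     msrs[i].host);
}

static int autoload_has(uint32_t msr)
{
	for (int i = 0; i < autoload_nr; i++)
		if (autoload[i].msr == msr)
			return 1;
	return 0;
}
```

Once guest_pebs_mask goes to zero and intel_guest_get_msrs() stops
enumerating DS_AREA, nothing in this scheme ever removes the old entry,
so the stale guest ds_area keeps being loaded on every VM-Enter.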
> + arr[(*nr)++] = (struct perf_guest_switch_msr){
> + .msr = MSR_IA32_DS_AREA,
> + .host = (unsigned long)cpuc->ds,
> + .guest = kvm_pmu->ds_area,
> + };
> +
> + if (x86_pmu.intel_cap.pebs_baseline) {
> + arr[(*nr)++] = (struct perf_guest_switch_msr){
> + .msr = MSR_PEBS_DATA_CFG,
> + .host = cpuc->active_pebs_data_cfg,
> + .guest = kvm_pmu->pebs_data_cfg,
> + };
> + }
> +
> /*
> * Do NOT mess with PEBS_ENABLED. As above, disabling counters via
> * PERF_GLOBAL_CTRL is sufficient, and loading a stale PEBS_ENABLED,
> --
> 2.54.0.rc0.605.g598a273b03-goog
>