Re: [PATCH v5 04/13] KVM: vmx/pmu: Emulate MSR_ARCH_LBR_DEPTH for guest Arch LBR

From: Jim Mattson
Date: Fri Jul 09 2021 - 16:35:50 EST


On Fri, Jul 9, 2021 at 2:51 AM Yang Weijiang <weijiang.yang@xxxxxxxxx> wrote:
>
> From: Like Xu <like.xu@xxxxxxxxxxxxxxx>
>
> The number of Arch LBR entries available is determined by the value
> in host MSR_ARCH_LBR_DEPTH.DEPTH. The supported LBR depth values are
> enumerated in CPUID.(EAX=01CH, ECX=0):EAX[7:0]. For each bit "n" set
> in this field, the MSR_ARCH_LBR_DEPTH.DEPTH value of "8*(n+1)" is
> supported.
>
> On a guest write to MSR_ARCH_LBR_DEPTH, all LBR entries are reset to 0.
> KVM emulates the reset behavior by introducing lbr_desc->arch_lbr_reset.
> KVM writes guest requested value to the native ARCH_LBR_DEPTH MSR
> (this is safe because the two values will be the same) when the Arch LBR
> records MSRs are pass-through to the guest.
>
> Signed-off-by: Like Xu <like.xu@xxxxxxxxxxxxxxx>
> Signed-off-by: Yang Weijiang <weijiang.yang@xxxxxxxxx>
> ---

> @@ -393,6 +417,7 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> {
> struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
> struct kvm_pmc *pmc;
> + struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);
> u32 msr = msr_info->index;
> u64 data = msr_info->data;
>
> @@ -427,6 +452,12 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> return 0;
> }
> break;
> + case MSR_ARCH_LBR_DEPTH:
> + if (!arch_lbr_depth_is_valid(vcpu, data))
> + return 1;

Does this imply that, when restoring a vCPU, KVM_SET_CPUID2 must be
called before KVM_SET_MSRS, so that arch_lbr_depth_is_valid() can check
the depth against the guest's CPUID? Is this ordering documented anywhere?
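
To illustrate the ordering concern, here is a minimal sketch of the kind
of check I assume arch_lbr_depth_is_valid() performs, based on the commit
message's description of CPUID.(EAX=01CH, ECX=0):EAX[7:0]. The helper name
depth_is_valid() and passing the bitmap as a parameter are my own
simplifications, not the patch's code:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: cpuid_depth_mask stands in for the guest's
 * CPUID.(EAX=01CH, ECX=0):EAX[7:0]. Bit n set means depth 8*(n+1)
 * is supported.
 */
static bool depth_is_valid(uint8_t cpuid_depth_mask, uint64_t depth)
{
	unsigned int n;

	/* Only multiples of 8 in the range 8..64 can be enumerated. */
	if (depth == 0 || depth % 8 != 0 || depth > 64)
		return false;

	n = (unsigned int)(depth / 8) - 1;
	return (cpuid_depth_mask & (1u << n)) != 0;
}
```

With an all-zero bitmap (no guest CPUID set yet), every depth is
rejected, which is why the ioctl order would matter.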

> + lbr_desc->records.nr = data;
> + lbr_desc->arch_lbr_reset = true;

Doesn't this make it impossible to restore vCPU state, since the LBRs
will be reset on the next VM-entry? At the very least, you probably
shouldn't set arch_lbr_reset when the MSR write is host-initiated.
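
A rough sketch of what I mean, written as a standalone model rather than
actual KVM code. struct lbr_model and write_lbr_depth() are hypothetical
names; in the patch the flag would presumably come from
msr_info->host_initiated:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the relevant lbr_desc fields. */
struct lbr_model {
	uint64_t depth;
	bool reset_pending;	/* models lbr_desc->arch_lbr_reset */
};

static void write_lbr_depth(struct lbr_model *lbr, uint64_t data,
			    bool host_initiated)
{
	lbr->depth = data;

	/*
	 * Only a guest write has the architectural side effect of
	 * clearing the LBRs. A userspace restore must leave them alone.
	 */
	if (!host_initiated)
		lbr->reset_pending = true;
}
```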

However, there is another problem: arch_lbr_reset isn't saved or
restored with the rest of the vCPU state. Even if you fix the
host-initiated issue, you still have a problem if the last guest
instruction prior to suspending the vCPU was a write to
IA32_LBR_DEPTH. If there is no subsequent VM-entry prior to saving the
vCPU state, then the stale LBRs will be saved/restored as part of the
guest XSAVE state, and they will not get cleared on resuming the vCPU.
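
The sequence I'm worried about can be modeled like this. Everything here
(struct vcpu_model, the four-entry LBR array, the helper names) is a
hypothetical toy, not KVM code; the point is only that the pending-reset
flag is not part of what gets saved:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NR_LBR 4

struct vcpu_model {
	uint64_t lbr[NR_LBR];	/* saved/restored (XSAVE state in real life) */
	bool reset_pending;	/* NOT saved/restored */
};

struct saved_state {
	uint64_t lbr[NR_LBR];
};

/* Guest WRMSR to the depth MSR: the clear is deferred to VM-entry. */
static void guest_write_depth(struct vcpu_model *v)
{
	v->reset_pending = true;
}

static void vm_entry(struct vcpu_model *v)
{
	if (v->reset_pending) {
		memset(v->lbr, 0, sizeof(v->lbr));
		v->reset_pending = false;
	}
}

static void save_state(const struct vcpu_model *v, struct saved_state *s)
{
	memcpy(s->lbr, v->lbr, sizeof(s->lbr));
}

static void restore_state(struct vcpu_model *v, const struct saved_state *s)
{
	memcpy(v->lbr, s->lbr, sizeof(v->lbr));
	v->reset_pending = false;	/* fresh vCPU, flag was never saved */
}
```

If guest_write_depth() is followed directly by save_state(), then after
restore_state() and vm_entry() on the destination the old LBR contents
are still there.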

> + return 0;
> default:
> if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
> (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
> @@ -566,6 +597,7 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu)
> lbr_desc->records.nr = 0;
> lbr_desc->event = NULL;
> lbr_desc->msr_passthrough = false;
> + lbr_desc->arch_lbr_reset = false;

I'm not sure this is entirely correct. If the last guest instruction
prior to a warm reset was a write to IA32_LBR_DEPTH, then the LBRs
should be cleared (and arch_lbr_reset will be true). However, if you
clear that flag here, the LBRs will never get cleared.

> }
>