Re: [PATCH v3 4/5] KVM/x86/vPMU: Add APIs to support host save/restore the guest lbr stack
From: Peter Zijlstra
Date: Thu Sep 20 2018 - 08:32:40 EST
On Thu, Sep 20, 2018 at 06:05:58PM +0800, Wei Wang wrote:
> diff --git a/arch/x86/kvm/pmu_intel.c b/arch/x86/kvm/pmu_intel.c
> index 5ab4a36..97a29d7 100644
> --- a/arch/x86/kvm/pmu_intel.c
> +++ b/arch/x86/kvm/pmu_intel.c
> @@ -342,6 +342,47 @@ static void intel_pmu_reset(struct kvm_vcpu *vcpu)
> pmu->global_ovf_ctrl = 0;
> }
>
> +int intel_pmu_enable_save_guest_lbr(struct kvm_vcpu *vcpu)
> +{
> + struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
> + struct perf_event *event;
> + struct perf_event_attr attr = {
> + .type = PERF_TYPE_RAW,
> + .size = sizeof(attr),
Bit yuck to imply .config = 0, like that.
> + .pinned = true,
Note that this can still result in failing to schedule the event, you
create a pinned task event, but a pinned cpu event takes precedence and
can claim the LBR.
How do you deal with that situation, where the LBR is in use by a host
event?
> + .exclude_host = true,
Didn't you mean to use .exclude_guest ? You don't want this thing to run
when the guest is running, right?
But I still don't like this hack much, what happens if userspace sets
that bit? You seems to have forgotten to combine it with PF_VCPU or
whatever is the appropriate bit to check.
Similarly, how does the existing exclude_{gues,host} crud work?
> + .sample_type = PERF_SAMPLE_BRANCH_STACK,
What's the point of setting .sample_type if !.sample_period ?
> + .branch_sample_type = PERF_SAMPLE_BRANCH_CALL_STACK |
> + PERF_SAMPLE_BRANCH_USER |
> + PERF_SAMPLE_BRANCH_KERNEL,
> + };
I think this function wants a comment to explain wth its doing and why,
because if someone stumbles over this code in a few months nobody will
remember those things.
> +
> + if (pmu->guest_lbr_event)
> + return 0;
> +
> + event = perf_event_create_kernel_counter(&attr, -1, current, NULL,
> + NULL);
> + if (IS_ERR(event)) {
> + pr_err("%s: failed %ld\n", __func__, PTR_ERR(event));
> + return -ENOENT;
> + }
> + pmu->guest_lbr_event = event;
> +
> + return 0;
> +}