Re: [PATCH v6 bpf-next 1/3] perf: enable branch record for software events

From: Song Liu
Date: Fri Sep 10 2021 - 14:27:42 EST

> On Sep 10, 2021, at 6:54 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Fri, Sep 10, 2021 at 12:40:51PM +0200, Peter Zijlstra wrote:
>
>> The below seems to cure that.
>
> Seems I lost a hunk, fold below.
>
> diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
> index 9e6d6eaeb4cb..6b72e9b55c69 100644
> --- a/arch/x86/events/intel/lbr.c
> +++ b/arch/x86/events/intel/lbr.c
> @@ -228,20 +228,6 @@ static void __intel_pmu_lbr_enable(bool pmi)
>  		wrmsrl(MSR_ARCH_LBR_CTL, lbr_select | ARCH_LBR_CTL_LBREN);
>  }
>
> -static void __intel_pmu_lbr_disable(void)
> -{
> -	u64 debugctl;
> -
> -	if (static_cpu_has(X86_FEATURE_ARCH_LBR)) {
> -		wrmsrl(MSR_ARCH_LBR_CTL, 0);
> -		return;
> -	}
> -
> -	rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);
> -	debugctl &= ~(DEBUGCTLMSR_LBR | DEBUGCTLMSR_FREEZE_LBRS_ON_PMI);
> -	wrmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);
> -}
> -
>  void intel_pmu_lbr_reset_32(void)
>  {
>  	int i;
> @@ -779,8 +765,12 @@ void intel_pmu_lbr_disable_all(void)
>  {
>  	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>
> -	if (cpuc->lbr_users && !vlbr_exclude_host())
> +	if (cpuc->lbr_users && !vlbr_exclude_host()) {
> +		if (static_cpu_has(X86_FEATURE_ARCH_LBR))
> +			return __intel_pmu_arch_lbr_disable();
> +
>  		__intel_pmu_lbr_disable();
> +	}
>  }
>
> void intel_pmu_lbr_read_32(struct cpu_hw_events *cpuc)

This works great and saves 3 entries! We have the following now:

ID: 0 from bpf_get_branch_snapshot+18 to intel_pmu_snapshot_branch_stack+0
ID: 1 from __brk_limit+477143934 to bpf_get_branch_snapshot+0
ID: 2 from __brk_limit+477192263 to __brk_limit+477143880 # trampoline
ID: 3 from __bpf_prog_enter+34 to __brk_limit+477192251
ID: 4 from migrate_disable+60 to __bpf_prog_enter+9
ID: 5 from __bpf_prog_enter+4 to migrate_disable+0
ID: 6 from bpf_testmod_loop_test+20 to __bpf_prog_enter+0
ID: 7 from bpf_testmod_loop_test+20 to bpf_testmod_loop_test+13
ID: 8 from bpf_testmod_loop_test+20 to bpf_testmod_loop_test+13
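
For reading the list above: entry 0 is the most recent branch, and IDs 0 through 6 are overhead from getting into the snapshot code, while ID 7 is the first branch that starts and ends inside bpf_testmod_loop_test. A rough userspace sketch of that counting follows. It is only an illustration, not code from this series: struct branch_entry, in_range() and count_wasted() are hypothetical names, and the struct models only the from/to addresses of a branch record.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified branch record: only from/to are modeled. */
struct branch_entry {
	uint64_t from;
	uint64_t to;
};

/* Return nonzero if addr lies in the half-open range [start, end). */
static int in_range(uint64_t addr, uint64_t start, uint64_t end)
{
	return addr >= start && addr < end;
}

/*
 * Count the entries that come before the first branch whose source and
 * destination both lie inside the traced function [start, end).
 * Entry 0 is assumed to be the most recent branch.
 * Returns n if no entry lies fully inside the function.
 */
static size_t count_wasted(const struct branch_entry *e, size_t n,
			   uint64_t start, uint64_t end)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (in_range(e[i].from, start, end) &&
		    in_range(e[i].to, start, end))
			break;
	}
	return i;
}
```

Fed nine records shaped like the listing above (with made-up addresses), count_wasted() returns 7, since ID 6 leaves the function and only IDs 7 and 8 stay inside it.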

I will fold this in and send v7.

Thanks,
Song