Re: [RFC 0/2] perf/core: Invoke pmu::sched_task callback for cpu events
From: Stephane Eranian
Date: Thu Nov 05 2020 - 03:30:08 EST
On Mon, Nov 2, 2020 at 6:52 AM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
>
> Hello,
>
> It was reported that system-wide events with precise_ip set have a lot
> of unknown symbols on Intel machines. Depending on the system load I
> can see more than 30% of total symbols are not resolved (actually
> don't have DSO mappings).
>
> I found that it happens only when large PEBS is enabled - using call-graph
> or the frequency mode will disable it and give valid results. I've verified
> it by checking whether intel_pmu_pebs_sched_task() is called, like below:
>
> # perf probe -a intel_pmu_pebs_sched_task
>
> # perf stat -a -e probe:intel_pmu_pebs_sched_task \
> > perf record -a -e cycles:ppp -c 100001 sleep 1
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 2.625 MB perf.data (10345 samples) ]
>
> Performance counter stats for 'system wide':
>
> 0 probe:intel_pmu_pebs_sched_task
>
> 2.157533991 seconds time elapsed
>
>
> Looking at the code, I found out that the pmu::sched_task callback was
> changed recently so that it's called only for task events. So cpu events
> with large PEBS didn't flush the buffer, and the samples were later
> attributed to unrelated tasks, which resulted in unresolved symbols.
>
> This patch reverts it and keeps the optimization for task events.
> While at it, I also found the context switch callback was not enabled
> for cpu events from the beginning. So I've added it too. With this
> applied, I can see the above callbacks are hit as expected and perf
> report has valid symbols.
>
This is a serious bug that impacts many kernel versions as soon as
multi-entry PEBS is activated by the kernel in system-wide mode.
I remember this working in the past, so it must have been broken by
some code refactoring, optimization, or extension of sched_task
to other features. PEBS must be flushed on context switch in per-cpu
mode; otherwise samples may be attributed to a process other than the
one that was running when they were captured. PEBS does not tag samples
with PID/TID, so the kernel can only assign them to whichever task is
current when the buffer is drained.
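
To illustrate the attribution problem, here is a minimal user-space
sketch. It is not kernel code: struct cpu_model, hw_record(), drain(),
context_switch() and run_scenario() are hypothetical names made up for
this example, and the true_pid[] array exists only so the model can
count mistakes (real PEBS records carry no such field). The sketch
assumes the buffer is drained lazily and each drained sample is charged
to the task that is current at drain time.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_CAP 16

/* Hypothetical model of one CPU with a multi-entry sample buffer. */
struct cpu_model {
    unsigned long ip[BUF_CAP];   /* what the hardware records */
    int true_pid[BUF_CAP];       /* ground truth, for checking only */
    size_t n;                    /* samples currently buffered */
    int current_pid;             /* task running on this CPU */
    int flush_on_switch;         /* 1 = drain buffer at context switch */
    size_t misattributed;        /* samples charged to the wrong task */
};

/* Hardware writes a sample; it knows nothing about PID/TID. */
static void hw_record(struct cpu_model *c, unsigned long ip)
{
    assert(c->n < BUF_CAP);
    c->ip[c->n] = ip;
    c->true_pid[c->n] = c->current_pid;
    c->n++;
}

/* Drain the buffer: every sample is charged to the current task. */
static void drain(struct cpu_model *c)
{
    for (size_t i = 0; i < c->n; i++)
        if (c->true_pid[i] != c->current_pid)
            c->misattributed++;
    c->n = 0;
}

/* Context switch, optionally flushing before the task changes. */
static void context_switch(struct cpu_model *c, int next_pid)
{
    if (c->flush_on_switch)
        drain(c);
    c->current_pid = next_pid;
}

/* PID 100 takes 3 samples, PID 200 takes 2, then the buffer is drained. */
static size_t run_scenario(int flush_on_switch)
{
    struct cpu_model c = { .current_pid = 100,
                           .flush_on_switch = flush_on_switch };

    hw_record(&c, 0x1000);
    hw_record(&c, 0x1004);
    hw_record(&c, 0x1008);
    context_switch(&c, 200);
    hw_record(&c, 0x2000);
    hw_record(&c, 0x2004);
    drain(&c);                   /* e.g. buffer threshold interrupt */
    return c.misattributed;
}
```

In this model, run_scenario(0) charges the three samples taken by
PID 100 to PID 200, while run_scenario(1) charges none to the wrong
task. The wrongly charged samples are the ones whose addresses would
be looked up against the wrong task's mappings.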