Re: [PATCH 2/2] perf stat: fix unexpected delay behaviour

From: Leo Yan
Date: Wed Dec 14 2022 - 10:58:22 EST


On Wed, Dec 14, 2022 at 11:41:55AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Tue, Dec 13, 2022 at 08:40:31AM -0800, Namhyung Kim escreveu:
> > Hi,
> >
> > On Tue, Dec 13, 2022 at 6:44 AM Arnaldo Carvalho de Melo
> > <acme@xxxxxxxxxx> wrote:
> > >
> > > Em Mon, Aug 01, 2022 at 09:20:37AM +0100, James Clark escreveu:
> > > >
> > > >
> > > > On 29/07/2022 17:12, Adrián Herrera Arcila wrote:
> > > > > The described --delay behaviour is to delay the enablement of events, but
> > > > > not the execution of the command, if one is passed, which is incorrectly
> > > > > the current behaviour.
> > > > >
> > > > > This patch decouples the enablement from the delay, and enables events
> > > > > before or after launching the workload dependent on the options passed
> > > > > by the user. This code structure is inspired by that in perf-record, and
> > > > > tries to be consistent with it.
> > > > >
> > > > > Link: https://lore.kernel.org/linux-perf-users/7BFD066E-B0A8-49D4-B635-379328F0CF4C@xxxxxx
> > > > > Fixes: d0a0a511493d ("perf stat: Fix forked applications enablement of counters")
> > > > > Signed-off-by: Adrián Herrera Arcila <adrian.herrera@xxxxxxx>
> > > > > ---
> > > > > tools/perf/builtin-stat.c | 56 ++++++++++++++++++++++-----------------
> > > > > 1 file changed, 32 insertions(+), 24 deletions(-)
> > > >
> > > > Looks good to me. Fixes the counter delay issue and the code is pretty
> > > > similar to perf record now. Although I would wait for Leo's or Song's
> > > > comment as well because they were involved.
> > >
> > > I think I didn't notice Leo's ack, it still applies, so I'm doing it
> > > now.
> >
> > I think the BPF counters should be enabled/disabled together.
>
> Ok, so I removed this one and applied Namhyung's.

I can guess why Adrián doesn't enable/disable BPF counters together :)

Since 'perf stat' doesn't enable BPF counters with other normal PMU
events in the first place, I believe this is deliberately by Song's
patch fa853c4b839e ("perf stat: Enable counting events for BPF
programs"), it says:

"'perf stat -b' creates per-cpu perf_event and loads fentry/fexit BPF
programs (monitor-progs) to the target BPF program (target-prog). The
monitor-progs read perf_event before and after the target-prog, and
aggregate the difference in a BPF map. Then the user space reads data
from these maps".

IIUC, when loading eBPF (counter) program, perf tool needs to handle
eBPF program map specially (so that perf tool can know the latest eBPF
program's map in kernel).

I don't know anything for eBPF counter, so this is why I am still a bit
puzzle which way is right to do (bind vs separate eBPF counters). But
I personally prefer to let eBPF counter to respect delay, so it's fine
for me to apply Namhyung's patch.

Thanks,
Leo