Re: [RFC PATCH] perf/stat: Add --disable-hwdt
From: Ingo Molnar
Date: Mon Feb 06 2017 - 07:22:48 EST
* Borislav Petkov <bp@xxxxxxxxx> wrote:
> Hi guys,
>
> so I've been tracing recently on an AMD F15h which has those funky counter
> constraints and am seeing this:
>
> # ./perf stat sleep 1
>
>  Performance counter stats for 'sleep 1':
>
>           0.749208      task-clock (msec)         #    0.001 CPUs utilized
>                  1      context-switches          #    0.001 M/sec
>                  0      cpu-migrations            #    0.000 K/sec
>                 54      page-faults               #    0.072 M/sec
>          1,122,815      cycles                    #    1.499 GHz
>            286,740      stalled-cycles-frontend   #   25.54% frontend cycles idle
>      <not counted>      stalled-cycles-backend      (0.00%)
>      ^^^^^^^^^^^^
>      <not counted>      instructions                (0.00%)
>      ^^^^^^^^^^^^
>      <not counted>      branches                    (0.00%)
>      <not counted>      branch-misses               (0.00%)
>
>        1.001550070 seconds time elapsed
>
>
> The problem is that the HW watchdog thing is already taking up a
> counter so when perf stat uses the default counters and when we reach
> stalled-cycles-backend, we run out of counters for the remaining events.
>
> So how about something like this:
>
> # ./perf stat --disable-hwdt sleep 1
>
>  Performance counter stats for 'sleep 1':
>
>           0.782552      task-clock (msec)         #    0.001 CPUs utilized
>                  1      context-switches          #    0.001 M/sec
>                  0      cpu-migrations            #    0.000 K/sec
>                 55      page-faults               #    0.070 M/sec
>          1,163,246      cycles                    #    1.486 GHz
>            293,598      stalled-cycles-frontend   #   25.24% frontend cycles idle
>            400,017      stalled-cycles-backend    #   34.39% backend cycles idle
>            676,505      instructions              #    0.58  insn per cycle
>                                                   #    0.59  stalled cycles per insn
>            133,822      branches                  #  171.007 M/sec
>              7,319      branch-misses             #    5.47% of all branches
>
>        1.001660058 seconds time elapsed
>
> We did explore other opportunities on IRC like sharing counters or
> making the HW WDT thing a 'soft' counter but all those are nasty and
> probably not really worth the trouble of touching perf core just so that
> this works.
>
> Besides, future generations don't have those constraints anymore so it
> is only F15h.
>
> Below is a silly patch as a syntactic sugar helper for perf stat. This
> is just an RFC anyway, I'll do it properly with fopen() if you're ok
> with the approach.

Looks sensible. In fact I'd make this the new default behavior when root runs
perf stat, i.e. disable the HW watchdog by default and add a flag to re-enable
it, for the rare case where we want to debug a hard deadlock while running
perf stat ...

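
For reference, a minimal sketch of what the fopen()-based version could look
like. It assumes the HW watchdog is toggled through the nmi_watchdog sysctl
file, /proc/sys/kernel/nmi_watchdog; the patch itself is not quoted above, so
that path and the helper names (sysctl_read_int, sysctl_write_int,
hwdt_disable, hwdt_restore) are illustrative assumptions, not what the patch
actually does:

```c
#include <stdio.h>

/* Assumed sysctl file for the HW (NMI) watchdog; not named in the mail. */
#define NMI_WATCHDOG_PATH "/proc/sys/kernel/nmi_watchdog"

/* Read one integer from a sysctl-style file. Returns 0 on success, -1 on error. */
static int sysctl_read_int(const char *path, int *val)
{
	FILE *f = fopen(path, "r");
	int ret = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%d", val) == 1)
		ret = 0;
	fclose(f);
	return ret;
}

/* Write one integer to a sysctl-style file. Returns 0 on success, -1 on error. */
static int sysctl_write_int(const char *path, int val)
{
	FILE *f = fopen(path, "w");
	int ret = 0;

	if (!f)
		return -1;
	if (fprintf(f, "%d\n", val) < 0)
		ret = -1;
	/* A rejected write may only be reported when the stream is flushed. */
	if (fclose(f) != 0)
		ret = -1;
	return ret;
}

/* Turn the watchdog off, remembering the previous setting in *saved. */
static int hwdt_disable(const char *path, int *saved)
{
	if (sysctl_read_int(path, saved) < 0)
		return -1;
	if (*saved == 0)
		return 0;	/* already off, nothing to write */
	return sysctl_write_int(path, 0);
}

/* Put back whatever setting hwdt_disable() found. */
static int hwdt_restore(const char *path, int saved)
{
	return sysctl_write_int(path, saved);
}
```

A caller would use hwdt_disable(NMI_WATCHDOG_PATH, &saved) before opening the
counters and hwdt_restore(NMI_WATCHDOG_PATH, saved) once the workload has
exited. Writing that file needs root, so an unprivileged run would simply see
the helpers fail and could carry on with the watchdog left enabled.
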
Thanks,
Ingo