Re: [PATCH v2] perf stat: Disable NMI watchdog message on hybrid

From: Jiri Olsa
Date: Wed Jun 09 2021 - 16:26:29 EST


On Wed, Jun 09, 2021 at 01:06:00PM +0800, Jin Yao wrote:
> If we run a single workload that only runs on the big core, there is always
> an ugly message about disabling the NMI watchdog because the atom events
> are not counted.
>
> Before:
>
> # ./perf stat true
>
> Performance counter stats for 'true':
>
>               0.43 msec task-clock                #    0.396 CPUs utilized
>                  0      context-switches          #    0.000 /sec
>                  0      cpu-migrations            #    0.000 /sec
>                 45      page-faults               #  103.918 K/sec
>            639,634      cpu_core/cycles/          #    1.477 G/sec
>      <not counted>      cpu_atom/cycles/                              (0.00%)
>            643,498      cpu_core/instructions/    #    1.486 G/sec
>      <not counted>      cpu_atom/instructions/                        (0.00%)
>            123,715      cpu_core/branches/        #  285.694 M/sec
>      <not counted>      cpu_atom/branches/                            (0.00%)
>              4,094      cpu_core/branch-misses/   #    9.454 M/sec
>      <not counted>      cpu_atom/branch-misses/                       (0.00%)
>
> 0.001092407 seconds time elapsed
>
> 0.001144000 seconds user
> 0.000000000 seconds sys
>
> Some events weren't counted. Try disabling the NMI watchdog:
> echo 0 > /proc/sys/kernel/nmi_watchdog
> perf stat ...
> echo 1 > /proc/sys/kernel/nmi_watchdog
>
> # ./perf stat -e '{cpu_atom/cycles/,msr/tsc/}' true
>
> Performance counter stats for 'true':
>
>      <not counted>      cpu_atom/cycles/                              (0.00%)
>      <not counted>      msr/tsc/                                      (0.00%)
>
> 0.001904106 seconds time elapsed
>
> 0.001947000 seconds user
> 0.000000000 seconds sys
>
> Some events weren't counted. Try disabling the NMI watchdog:
> echo 0 > /proc/sys/kernel/nmi_watchdog
> perf stat ...
> echo 1 > /proc/sys/kernel/nmi_watchdog
> The events in group usually have to be from the same PMU. Try reorganizing the group.
>
> Now we disable the NMI watchdog message on hybrid, otherwise there
> are too many false positives.
>
> After:
>
> # ./perf stat true
>
> Performance counter stats for 'true':
>
>               0.79 msec task-clock                #    0.419 CPUs utilized
>                  0      context-switches          #    0.000 /sec
>                  0      cpu-migrations            #    0.000 /sec
>                 48      page-faults               #   60.889 K/sec
>            777,692      cpu_core/cycles/          #  986.519 M/sec
>      <not counted>      cpu_atom/cycles/                              (0.00%)
>            669,147      cpu_core/instructions/    #  848.828 M/sec
>      <not counted>      cpu_atom/instructions/                        (0.00%)
>            128,635      cpu_core/branches/        #  163.176 M/sec
>      <not counted>      cpu_atom/branches/                            (0.00%)
>              4,089      cpu_core/branch-misses/   #    5.187 M/sec
>      <not counted>      cpu_atom/branch-misses/                       (0.00%)
>
> 0.001880649 seconds time elapsed
>
> 0.001935000 seconds user
> 0.000000000 seconds sys
>
> # ./perf stat -e '{cpu_atom/cycles/,msr/tsc/}' true
>
> Performance counter stats for 'true':
>
>      <not counted>      cpu_atom/cycles/                              (0.00%)
>      <not counted>      msr/tsc/                                      (0.00%)
>
> 0.000963319 seconds time elapsed
>
> 0.000999000 seconds user
> 0.000000000 seconds sys
>
> Signed-off-by: Jin Yao <yao.jin@xxxxxxxxxxxxxxx>
> ---
> v2:
> - If a group mixed hybrid and non-hybrid events, the NMI watchdog
> message was still reported. v2 adds a check for such mixed groups.
>
> v1:
> - Get ACK from Jiri.
>
> tools/perf/util/stat-display.c | 22 +++++++++++++++++++---
> 1 file changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> index b759dfd633b4..c1314f13bc9a 100644
> --- a/tools/perf/util/stat-display.c
> +++ b/tools/perf/util/stat-display.c
> @@ -404,6 +404,19 @@ static bool is_mixed_hw_group(struct evsel *counter)
>  	return false;
>  }
>
> +static bool is_mixed_hybrid_group(struct evsel *counter)
> +{
> +	struct evlist *evlist = counter->evlist;
> +	struct evsel *pos;
> +
> +	evlist__for_each_entry(evlist, pos) {
> +		if (perf_pmu__is_hybrid(pos->pmu_name))
> +			return true;
> +	}

so the function returns true if there's at least one hybrid event in the
list, right? the list could consist entirely of hybrid events, but the
function name suggests hybrid events mixed with normal ones
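
to illustrate the difference, here's a minimal standalone sketch (not perf
code: struct toy_evsel, toy_pmu_is_hybrid() and both helpers are hypothetical
stand-ins, and the hybrid check simply matches the cpu_core/cpu_atom names
seen in the output above, which is an assumption about naming only):

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct evsel; only the PMU name matters here. */
struct toy_evsel {
	const char *pmu_name;
};

/*
 * Hypothetical stand-in for perf_pmu__is_hybrid(). Assumption: a PMU is
 * "hybrid" if it is named cpu_core or cpu_atom.
 */
static bool toy_pmu_is_hybrid(const char *pmu_name)
{
	return pmu_name &&
	       (!strcmp(pmu_name, "cpu_core") || !strcmp(pmu_name, "cpu_atom"));
}

/* Behaviour of the loop in the patch: true as soon as any event is hybrid. */
static bool toy_has_hybrid_event(const struct toy_evsel *evs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (toy_pmu_is_hybrid(evs[i].pmu_name))
			return true;
	}
	return false;
}

/* What "mixed" suggests: at least one hybrid AND one non-hybrid event. */
static bool toy_is_mixed_hybrid(const struct toy_evsel *evs, size_t n)
{
	bool hybrid = false, other = false;

	for (size_t i = 0; i < n; i++) {
		if (toy_pmu_is_hybrid(evs[i].pmu_name))
			hybrid = true;
		else
			other = true;
	}
	return hybrid && other;
}
```

for a list of { cpu_core, cpu_atom } the first helper returns true and the
second returns false; for { cpu_atom, msr } both return true. whichever
behaviour is intended, the name should match it.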

jirka

> +
> +	return false;
> +}
> +
>  static void printout(struct perf_stat_config *config, struct aggr_cpu_id id, int nr,
>  		     struct evsel *counter, double uval,
>  		     char *prefix, u64 run, u64 ena, double noise,
> @@ -465,9 +478,12 @@ static void printout(struct perf_stat_config *config, struct aggr_cpu_id id, int
>  			config->csv_sep);
>
>  		if (counter->supported) {
> -			config->print_free_counters_hint = 1;
> -			if (is_mixed_hw_group(counter))
> -				config->print_mixed_hw_group_error = 1;
> +			if (!is_mixed_hybrid_group(counter)) {
> +				if (!perf_pmu__is_hybrid(counter->pmu_name))
> +					config->print_free_counters_hint = 1;
> +				if (is_mixed_hw_group(counter))
> +					config->print_mixed_hw_group_error = 1;
> +			}
>  		}
>
>  		fprintf(config->output, "%-*s%s",
> --
> 2.17.1
>