Re: [PATCH v2] perf stat: Fix false NMI watchdog warning in aggregation modes
From: Ian Rogers
Date: Thu Jun 11 2026 - 18:33:59 EST
On Thu, Jun 11, 2026 at 2:56 PM Chun-Tse Shao <ctshao@xxxxxxxxxx> wrote:
>
> In aggregation modes (e.g. --per-socket, --per-die, etc.), a
> counter might not be scheduled or counted on specific aggregate
> groups if it was not assigned to the CPUs belonging to those
> groups. However, the printout() check triggers the
> "print_free_counters_hint" logic unconditionally for any
> supported counter with a missing count. This results in a false
> "Some events weren't counted. Try disabling the NMI watchdog"
> warning.
>
> Furthermore, the NMI watchdog only reserves performance counters
> on core PMUs. Uncore PMU events (e.g. CHA, IMC) are not affected
> by the NMI watchdog, but their failures also falsely triggered
> this warning.
>
> This warning was originally introduced in commit 02d492e5dcb7
> ("perf stat: Issue a HW watchdog disable hint").
>
> To fix this, restrict setting of print_free_counters_hint to
> only trigger for core PMU events by checking counter->pmu and
> counter->pmu->is_core.
>
> Example before/after:
>
> $ perf stat -M lpm_miss_lat --metric-only --per-socket -a -- sleep 1
>
> Before:
> Performance counter stats for 'system wide':
>
> ns lpm_miss_lat_rem ns lpm_miss_lat_loc
> S0 126 202.3 207.9
> S1 126 231.9 259.3
>
> 1.006029831 seconds time elapsed
>
> Some events weren't counted. Try disabling the NMI watchdog:
> echo 0 > /proc/sys/kernel/nmi_watchdog
> perf stat ...
> echo 1 > /proc/sys/kernel/nmi_watchdog
>
> After:
> Performance counter stats for 'system wide':
>
> ns lpm_miss_lat_rem ns lpm_miss_lat_loc
> S0 126 202.3 207.9
> S1 126 231.9 259.3
>
> 1.006029831 seconds time elapsed
>
> Assisted-by: Gemini:gemini-next
> Signed-off-by: Chun-Tse Shao <ctshao@xxxxxxxxxx>
Reviewed-by: Ian Rogers <irogers@xxxxxxxxxx>
The suggestion to use something like:
```
pmu = evsel__find_pmu(counter);
if (pmu && pmu->is_core)
```
isn't really necessary because counter->pmu is nearly always set. Also,
the cases where an evsel lacks a PMU have historically been core PMU
events, which makes the suggested lookup somewhat contradictory. The
patch as-is is clear.
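For illustration only, here is a minimal standalone sketch of the
condition the patch ends up with. The mock_pmu and mock_evsel types and
the wants_watchdog_hint() helper are hypothetical stand-ins, not perf's
real struct perf_pmu / struct evsel, and the hybrid flag merely stands
in for the evlist__has_hybrid_pmus() call:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-ins for perf's PMU and evsel structures; only the
 * fields that matter for the hint decision are modeled here.
 */
struct mock_pmu {
	bool is_core;			/* true for core PMUs, false for uncore */
};

struct mock_evsel {
	bool supported;			/* event could be opened */
	bool hybrid;			/* stands in for evlist__has_hybrid_pmus() */
	const struct mock_pmu *pmu;	/* may be NULL */
};

/*
 * Return true only when a missing count on this counter should raise
 * the NMI watchdog hint: supported, non-hybrid, and on a core PMU.
 * A NULL pmu is treated as "no hint", as in the patch.
 */
static bool wants_watchdog_hint(const struct mock_evsel *counter)
{
	assert(counter != NULL);

	return counter->supported && !counter->hybrid &&
	       counter->pmu && counter->pmu->is_core;
}
```

With this shape an uncore event (is_core == false) or an evsel without
a PMU never sets the hint, which is the behaviour the commit message
describes.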
Thanks,
Ian
> ---
> tools/perf/util/stat-display.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> index 2b69d238858c..0a5750bb59fa 100644
> --- a/tools/perf/util/stat-display.c
> +++ b/tools/perf/util/stat-display.c
> @@ -821,9 +821,9 @@ static void printout(struct perf_stat_config *config, struct outstate *os,
> ok = false;
>
> if (counter->supported) {
> - if (!evlist__has_hybrid_pmus(counter->evlist)) {
> + if (!evlist__has_hybrid_pmus(counter->evlist) &&
> + counter->pmu && counter->pmu->is_core)
> config->print_free_counters_hint = 1;
> - }
> }
> }
>
> --
> 2.54.0.1136.gdb2ca164c4-goog
>