Re: [PATCH 2/2] perf: Don't throttle based on NMI watchdog events

From: Calvin Owens

Date: Tue Mar 31 2026 - 14:12:30 EST


On Tuesday 03/31 at 10:22 -0700, Calvin Owens wrote:
> On Tuesday 03/31 at 08:25 -0700, Calvin Owens wrote:
> > @@ -663,6 +666,17 @@ void perf_sample_event_took(u64 sample_len_ns)
> > if (avg_len <= max_len)
> > return;
> >
> > + /*
> > + * Very infrequent events like the perf counter hard watchdog
> > + * can trigger spurious throttling: skip throttling if the prior
> > + * NMI got here more than one second before this NMI began.
> > + */
> > + now = local_clock();
> > + delta = now - __this_cpu_read(last_throttle_clock);
> > + __this_cpu_write(last_throttle_clock, now);
> > + if (delta - sample_len_ns > NSEC_PER_SEC)
> > + return;
>
> Bah, Sashiko caught something obvious I missed:
>
> https://sashiko.dev/#/patchset/cover.1774969692.git.calvin%40wbinvd.org
>
> >> When the outer handler completes, its sample_len_ns (total execution
> >> time) will be strictly greater than delta (time since the inner
> >> handler finished). This guarantees delta < sample_len_ns, causing the
> >> subtraction to underflow to a massive positive value.
> >>
> >> The condition > NSEC_PER_SEC will then evaluate to true, and the outer
> >> handler will erroneously skip the perf throttling logic. Should this
> >> check be rewritten to avoid subtraction, perhaps by using if (delta >
> >> sample_len_ns + NSEC_PER_SEC)?
>
> The solution it proposed makes sense to me.

I replied too quickly: I think Sashiko is actually wrong.

It is assuming that sample_len_ns includes the latency of
perf_sample_event_took(), but it does not.

Nesting in the middle of the RMW of the percpu value strictly makes
last_throttle_clock appear to have happened *sooner* to the outer NMI,
so I think that case works.

Thanks, apologies again for all the noise here,
Calvin

> > __report_avg = avg_len;
> > __report_allowed = max_len;
> >
> > --
> > 2.47.3
> >