Re: [PATCH v2 2/2] perf: Don't throttle based on NMI watchdog events
From: Calvin Owens
Date: Fri May 01 2026 - 13:12:40 EST
On Wednesday 04/29 at 15:08 -0700, Calvin Owens wrote:
> On Wednesday 04/29 at 20:38 +0000, sashiko-bot@xxxxxxxxxx wrote:
> > > @@ -636,6 +639,19 @@ void perf_sample_event_took(u64 sample_len_ns)
> > > if (avg_len <= max_len)
> > > return;
> > >
> > > + /*
> > > + * Very infrequent events like the perf counter hard watchdog
> > > + * can trigger spurious throttling: skip throttling if the prior
> > > + * NMI got here more than one second before this NMI began. But
> > > + * never skip throttling if NMIs are nesting, or if any NMI runs
> > > + * for longer than one second.
> > > + */
> > > + now = local_clock();
> > > + last = __this_cpu_read(last_throttle_clock);
> > > + if (__this_cpu_cmpxchg(last_throttle_clock, last, now) == last &&
> >
> > On 32-bit architectures, is it safe to use __this_cpu_cmpxchg() on a 64-bit
> > value in NMI context?
> >
> > For architectures like 32-bit x86, the generic fallback implementation
> > achieves atomicity by disabling interrupts, which does not mask NMIs. If a
> > nested NMI interrupts the sequence halfway through reading or writing the
> > 32-bit halves of the u64, could this result in a torn read or write?
> >
> > A corrupted timestamp could cause a massive wrap-around in the time gap
> > calculation, perpetually satisfying the > NSEC_PER_SEC bypass condition and
> > silently disabling PMU throttling for all events on that CPU.
> >
> > > + now - last > NSEC_PER_SEC && sample_len_ns < NSEC_PER_SEC)
>
> If this is a problem, isn't it also a problem for the 64-bit store after
> updating the EWMA just above this?
>
> I guess last_throttle_clock could be a u32 and use the low clock bits,
> that's sufficient with the one second limit... but I would appreciate a
> real human opinion :)
Andi, since you had looked at the original patch, do you have any
thoughts about this?
I think the LLM is splitting hairs and wasting all of our time: I don't
want to waste more of my and everyone else's time on the autofeedback
unless a real human being agrees it matters. I feel like I've already
been far too generous to it TBH...
This dynamic throttling only exists to prevent users from shooting
themselves in the foot and losing the machine. All the edge cases the
LLM has described only transiently delay throttling when they occur,
and I don't think that matters at all.
Thanks,
Calvin