Re: [PATCH] x86/nmi: ratelimit unknown nmi logs

From: Peter Zijlstra
Date: Tue Feb 26 2019 - 06:54:58 EST


On Wed, Feb 20, 2019 at 10:00:28AM -0800, Olof Johansson wrote:
> On Wed, Feb 20, 2019 at 12:59 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > On Tue, Feb 19, 2019 at 05:48:36PM -0800, Olof Johansson wrote:
> > > Getting notified of unknown NMIs is obviously important, but getting
> > > notified on every single one, especially on larger systems with a slow
> > > (serial) console, causes more harm than good when it's a known noisy,
> > > non-relevant event.
> > >
> > > So, let's ratelimit to avoid locking up the system.
> >
> > What kind of bonghit broken crap system is that?

Still interested to know what system and why this happens.

> > That is; this _really_ should not happen, and this is a bandaid, not
> > fixing the cause.
>
> Oh, I agree -- this shouldn't happen, and it's being debugged and fixed.
>
> So, I'm not looking at this as a bandaid to the real problem, but
> there's also no reason to DoS the system with printk when it does
> occur. If you want to configure the system to panic on unknown NMI
> there are already hooks for it.
>
> I'm obviously happy to carry local patches for this, since it's a
> temporary problem. But yet again, I don't see a reason to have the
> kernel run off the rails for this condition.

Fair enough, I suppose. Personally I don't care either way; you could
just boot without the slow serial console in order to install a new kernel.