Re: 64bit x86: NMI nesting still buggy?

From: Ingo Molnar
Date: Tue May 06 2014 - 06:02:27 EST



* Jiri Kosina <jkosina@xxxxxxx> wrote:

> On Tue, 29 Apr 2014, Steven Rostedt wrote:
>
> > > According to 38.4 of [1], when SMM mode is entered while the CPU is
> > > handling NMI, the end result might be that upon exit from SMM, NMIs will
> > > be re-enabled and latched NMI delivered as nested [2].
> >
> > Note, if this were true, then the x86_64 hardware would be extremely
> > buggy. That's because NMIs are not made to be nested. If SMM is
> > entered during an NMI and re-enables NMIs, then *all* software would break.
> > That would basically make NMIs useless.
> >
> > The only time I've ever witnessed problems (and I stress NMIs all the
> > time), is when the NMI itself does a fault. Which my patch set handles
> > properly.
>
> Yes, it indeed does.
>
> In the scenario I have outlined, the race window is extremely small,
> plus NMIs don't happen that often, plus SMIs don't happen that
> often, plus (hopefully) many BIOSes don't enable NMIs upon SMM exit.

Note, the "NMIs don't happen that often" assumption rarely holds on
x86 Linux systems. These days anyone doing a 'perf top', 'perf record'
or running a profiling tool like SysProf will generate tens of
thousands of NMIs per second. Systems with profiling active are
literally bathed in NMIs, and that is how we found the page fault NMI
bug.
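
A rough way to check the NMI rate on a given machine is to sample the
"NMI:" row of /proc/interrupts twice and take the difference. The
following is only a minimal sketch, assuming an x86 Linux system where
that row is present; the function names are illustrative and not part
of any existing tool:

```python
import time


def parse_nmi_counts(interrupts_text):
    """Return the per-CPU NMI counters from /proc/interrupts-style text."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NMI:":
            # Per-CPU counters follow the label; the trailing
            # description ("Non-maskable interrupts") is not numeric.
            return [int(f) for f in fields[1:] if f.isdigit()]
    return []  # no NMI row found (e.g. not an x86 system)


def nmi_rate(before, after, seconds):
    """NMIs per second, summed over all CPUs, between two samples."""
    return (sum(after) - sum(before)) / seconds


def sample(path="/proc/interrupts"):
    """Read the current per-CPU NMI counters."""
    with open(path) as f:
        return parse_nmi_counts(f.read())


def main(interval=1.0):
    """Print the system-wide NMI rate measured over `interval` seconds."""
    first = sample()
    time.sleep(interval)
    second = sample()
    print(f"{nmi_rate(first, second, interval):.0f} NMIs/sec")
```

Calling main() once on an idle system and once while 'perf top' is
running should make the difference in NMI load visible.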

So I'd say any race condition hypothesis assuming "NMIs are rare" is
probably invalid on modern Linux systems.

Thanks,

Ingo