Re: [RFC] #MC mess
From: Borislav Petkov
Date: Tue Feb 18 2020 - 14:50:42 EST
On Tue, Feb 18, 2020 at 01:11:58PM -0500, Steven Rostedt wrote:
> What's the issue with tracing? Does this affect the tracing done by the
> edac_mc_handle_error code?
>
> It has a trace event in it, that the rasdaemon uses.
Nah, that code is called from process context.
The problem with tracing the #MC handler is the same as with tracing the
NMI handler. The NMI handler does all kinds of dancing wrt breakpoints
and nested NMIs, while the #MC handler doesn't do any of that. Not sure
if it should at all, btw.
> I believe static_key_disable() sleeps, and does all kinds of crazy
> things (like update the code).
True story, thanks for that hint!
static_key_disable()
  |-> cpus_read_lock()
       |-> percpu_down_read(&cpu_hotplug_lock)
            |-> might_sleep()
Yuck. Which means, the #MC handler must switch to __rdmsr()/__wrmsr()
now.
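To illustrate the distinction, here is a minimal userspace sketch, not kernel code. It assumes a traced accessor is just the raw access plus a tracepoint hook, which is the general shape of the thing but not a copy of the kernel's implementation. Every mock_* name is hypothetical and exists only for this example; the array stands in for the hardware MSRs and the counter stands in for the MSR tracepoint firing.

```c
/* Userspace mock, NOT kernel code. All mock_* names are hypothetical. */
#include <assert.h>
#include <stdint.h>

#define MOCK_NUM_MSRS 4u

static uint64_t mock_msr_bank[MOCK_NUM_MSRS];	/* stands in for hardware MSRs */
static unsigned int mock_trace_hits;		/* counts simulated tracepoint firings */

/* Raw accessors: model __rdmsr()/__wrmsr(), i.e. the access and nothing else. */
static uint64_t mock_raw_rdmsr(unsigned int msr)
{
	assert(msr < MOCK_NUM_MSRS);
	return mock_msr_bank[msr];
}

static void mock_raw_wrmsr(unsigned int msr, uint64_t val)
{
	assert(msr < MOCK_NUM_MSRS);
	mock_msr_bank[msr] = val;
}

/* Traced accessor: the same read, followed by a simulated tracepoint hook. */
static uint64_t mock_traced_rdmsr(unsigned int msr)
{
	uint64_t val = mock_raw_rdmsr(msr);

	mock_trace_hits++;	/* in a real kernel, tracing code would run here */
	return val;
}

/* Hypothetical #MC-style path: uses only the raw accessor, so no tracing runs. */
static uint64_t mock_mc_handler_read_status(unsigned int msr)
{
	return mock_raw_rdmsr(msr);
}
```

Reading through mock_mc_handler_read_status() leaves mock_trace_hits untouched, while mock_traced_rdmsr() bumps it. That is the whole point of using the raw variants in a context where tracing must not run.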
I wish I could travel back in time and NAK the hell out of that MSR
tracepoint crap.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette