Re: [PATCH v2 6/6] x86/mce: Dynamically register default MCE handler
From: Luck, Tony
Date: Fri Jan 03 2020 - 16:43:05 EST
On Fri, Jan 03, 2020 at 04:07:22PM +0100, Jan H. Schönherr wrote:
> The default MCE handler takes action, when no external MCE handler is
> registered. Instead of checking for external handlers within the default
> MCE handler, only register the default MCE handler when there are no
> external handlers in the first place.
> Signed-off-by: Jan H. Schönherr <jschoenh@xxxxxxxxx>
> New in v2. This is something that became possible due to patch 4.
> I'm not entirely happy with it.
> On the one hand, I'm wondering whether there's a nicer way to handle
> (de-)registration races.
Instead of unregistering/registering the default notifier depending
on whether there are other notifiers, couldn't you just make the
default notifier check to see if it should print? E.g.

static int mce_default_notifier(struct notifier_block *nb, unsigned long val,
				void *data)
{
	struct mce *m = (struct mce *)data;

	if (m && !atomic_read(&num_notifiers))
		__print_mce(m);

	return NOTIFY_DONE;
}

> On the other hand, I'm starting to question the whole logic to "only print
> the MCE if nothing else in the kernel has a handler registered". Is that
> really how it should be? For example, there are handlers that filter for a
> specific subset of MCEs. If one of those is registered, we're losing all
> information for MCEs that don't match.
Maybe put control of this into the hands of the notifiers ... if
a notifier thinks that it does something useful with the log
that makes the console log redundant, then it could call a function
to bump the count to suppress the __print_mce(). Simple filter functions
on the chain wouldn't do that.
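
To make that concrete, here is a rough userspace model of the idea, not
kernel code. The helper names mce_suppress_console() and
mce_unsuppress_console() are made up for illustration, print_mce() stands
in for __print_mce(), and struct mce is cut down to two fields. The
default notifier stays on the chain permanently and only consults the
counter:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's struct mce; only two fields modelled. */
struct mce {
	unsigned int bank;
	unsigned long long status;
};

/* Number of notifiers that claim the console log would be redundant. */
static atomic_int suppress_console_mce;

/* Records printed so far; kept only so the sketch can be checked. */
static int printed_records;

/* Hypothetical helper: a notifier that fully consumes the log calls this. */
static void mce_suppress_console(void)
{
	atomic_fetch_add(&suppress_console_mce, 1);
}

/* Hypothetical helper: the same notifier calls this when it goes away. */
static void mce_unsuppress_console(void)
{
	int old = atomic_fetch_sub(&suppress_console_mce, 1);

	assert(old > 0);	/* unbalanced calls would be a bug */
	(void)old;
}

/* Stand-in for __print_mce(). */
static void print_mce(const struct mce *m)
{
	printf("MCE: bank %u status 0x%llx\n", m->bank, m->status);
	printed_records++;
}

/*
 * Default notifier: never unregistered, prints only while nobody has
 * asked for suppression. Returns true if the record was printed.
 */
static bool mce_default_notifier(const struct mce *m)
{
	if (!m || atomic_load(&suppress_console_mce))
		return false;

	print_mce(m);
	return true;
}
```

A filter that only looks at a subset of MCEs would simply never call
mce_suppress_console(), so records it ignores still reach the console.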
If we go this path the variable should be named something like
"suppress_console_mce" rather than num_notifiers.
Might also be useful to have some command line option or debugfs knob
to force printing for those cases where bad stuff is happening and we
don't see what was logged before a crash drops all the bits on the floor.
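
A sketch of how such an override could combine with the suppress count,
again as a userspace model. The option string "force_print" and the names
mce_force_print, mce_parse_option() and mce_should_print() are
assumptions for illustration; nothing like them exists in the tree today
as far as this mail is concerned:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical knob; in the kernel it would be set from a boot option or debugfs. */
static bool mce_force_print;

/* Plain int here; the kernel version would be an atomic_t. */
static int suppress_console_mce;

/* Parse one option token; returns true if it was recognised. */
static bool mce_parse_option(const char *opt)
{
	assert(opt != NULL);

	if (strcmp(opt, "force_print") == 0) {
		mce_force_print = true;
		return true;
	}
	return false;
}

/* The force knob wins over any number of suppress requests. */
static bool mce_should_print(void)
{
	return mce_force_print || suppress_console_mce == 0;
}
```

With this shape the default notifier would test mce_should_print()
instead of reading the counter directly, so the override needs no
changes in the notifiers that asked for suppression.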