Re: [PATCH]: mce: don't print "human readable" message for corrected errors

From: Borislav Petkov
Date: Tue Apr 12 2011 - 16:29:04 EST


On Tue, Apr 12, 2011 at 04:15:38PM -0400, Prarit Bhargava wrote:
>
> > We are also setting TAINT_MACHINE_CHECK for corrected errors - perhaps
> > this made sense when systems were small and machine checks were rare and
> > scary. But I think we need to start working with the reality that
> > corrected errors are normal events.
> >
> >
>
> It still makes sense for small (< 1TB) systems, IMO. But ... maybe a
> flag of some sort is necessary to stop the TAINTing for systems larger
> than that. The CEs may point to something going wrong on a system. CEs
> in theory become UCEs eventually, right?
>
> From an OS point of view, we would like to know that there is flaky HW on
> the system.
>
> /me knows this is going to turn into a PFA discussion in 4 ... 3 ... 2
> .... ;)

Yeah, there's the TAINT thing too, good point Tony. Well, we definitely
don't want to get tainted for correctable errors - they're too "normal"
to do so, IMHO.

I'm thinking we remove the TAINT for CEs and don't call the default
notifier if it is the only notifier registered. Maybe something like

	if (num_notifiers(&x86_mce_decoder_chain) > 1)
		atomic_notifier_call_chain(&x86_mce_decoder_chain, 0, &m);

or, since the notifiers are sorted by priority, don't call notifiers
with a priority of -1.

Or something to that effect.
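
To make the two options a bit more concrete, here is a rough userspace
sketch, not kernel code. Everything in it is made up for illustration:
struct notifier and struct chain are simplified stand-ins for the
kernel's notifier types, num_notifiers() does not exist in the kernel
as far as I know, and the assumption that the default decoder sits at
priority -1 comes only from the suggestion above.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for the kernel's notifier block. */
struct notifier {
	int (*call)(struct notifier *nb, unsigned long val, void *data);
	struct notifier *next;
	int priority;
};

struct chain {
	struct notifier *head;
};

/* Insert so that the list stays sorted by descending priority. */
static void chain_register(struct chain *c, struct notifier *nb)
{
	struct notifier **p = &c->head;

	assert(nb != NULL && nb->call != NULL);
	while (*p && (*p)->priority >= nb->priority)
		p = &(*p)->next;
	nb->next = *p;
	*p = nb;
}

/* Hypothetical helper: count the registered notifiers. */
static int num_notifiers(const struct chain *c)
{
	int n = 0;

	for (const struct notifier *nb = c->head; nb; nb = nb->next)
		n++;
	return n;
}

/* Option 1: call the chain only if something besides the default
 * notifier is registered. Returns the number of callbacks invoked. */
static int chain_call_if_shared(struct chain *c, unsigned long val, void *data)
{
	int called = 0;

	if (num_notifiers(c) <= 1)
		return 0;
	for (struct notifier *nb = c->head; nb; nb = nb->next) {
		nb->call(nb, val, data);
		called++;
	}
	return called;
}

/* Option 2: the list is priority sorted, so stop at the first
 * notifier with a negative priority (assumed to be the default one). */
static int chain_call_skip_default(struct chain *c, unsigned long val, void *data)
{
	int called = 0;

	for (struct notifier *nb = c->head; nb && nb->priority >= 0; nb = nb->next) {
		nb->call(nb, val, data);
		called++;
	}
	return called;
}

/* Sample callback: bump the counter passed in through data. */
static int count_call(struct notifier *nb, unsigned long val, void *data)
{
	(void)nb;
	(void)val;
	++*(int *)data;
	return 0;
}
```

The difference between the two: option 1 still runs the default
notifier whenever a real decoder is registered, while option 2 never
runs it from this path.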

--
Regards/Gruss,
Boris.