Re: [PATCH v3] x86/mce: Try printing all machine check banks known before panic
From: Borislav Petkov
Date: Fri Nov 21 2014 - 16:35:56 EST

On Fri, Nov 21, 2014 at 09:31:56PM +0000, Luck, Tony wrote:
> >
> > 	/*
> > 	 * No machine check event found. Must be some external
> > 	 * source or one CPU is hung. Panic.
> > 	 */
> > 	if (global_worst <= MCE_KEEP_SEVERITY && mca_cfg.tolerant < 3)
> > 		mce_panic("Machine check from unknown source", NULL, NULL);
> >
> > Provided this comment is correct, it doesn't sound like any MCE record
> > will ever tell us what caused the error, since an external source or a
> > hung CPU doesn't generate an MCE record in any bank, does it?
>
> That means there were no VALID=1, EN=1, S=1 errors anywhere. But there
> might be some other things logged that would help us understand.

By "other things" you mean other MCEs?

> We are into cpu errata territory here though ... we aren't supposed to get
> machine checks that don't have a logged cause. We panic for spurious
> machine checks because we know something has gone horribly wrong,
> even if we don't know what that something was.

Oh, cpu errata. So this would mean that we can't even rely on the
contents of the MCA banks, can we?

In any case, is any of the information in the MCA banks in such cases
even usable then? Because if not, we're definitely barking up the wrong
tree...
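
To make the distinction discussed above concrete, here is a minimal
user-space sketch in C, not kernel code. It assumes the architectural
MCi_STATUS bit positions (VAL = bit 63, EN = bit 60, S = bit 56); the
S bit is only defined on CPUs that advertise software error recovery.
The helper names bank_signaled_mce(), bank_has_log() and
count_unsignaled_logs() are hypothetical and exist only for
illustration. The sketch separates banks that would explain the machine
check from banks that merely hold some valid log.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural MCi_STATUS bits (assumed positions, see lead-in). */
#define MCI_STATUS_VAL (1ULL << 63)	/* bank holds a valid log */
#define MCI_STATUS_EN  (1ULL << 60)	/* error reporting was enabled */
#define MCI_STATUS_S   (1ULL << 56)	/* error was signaled via #MC */

/* Hypothetical helper: bank has VALID=1, EN=1 and S=1 all set. */
bool bank_signaled_mce(uint64_t status)
{
	const uint64_t need = MCI_STATUS_VAL | MCI_STATUS_EN | MCI_STATUS_S;

	return (status & need) == need;
}

/* Hypothetical helper: bank holds any valid log at all. */
bool bank_has_log(uint64_t status)
{
	return (status & MCI_STATUS_VAL) != 0;
}

/*
 * Hypothetical helper: count banks that hold a valid log but do not
 * satisfy the VALID=1, EN=1, S=1 test. These are the "other things
 * logged" that would not identify a cause on their own.
 */
size_t count_unsignaled_logs(const uint64_t *status, size_t nbanks)
{
	size_t n = 0;

	for (size_t i = 0; i < nbanks; i++)
		if (bank_has_log(status[i]) && !bank_signaled_mce(status[i]))
			n++;

	return n;
}
```

Under these assumptions, a non-zero result from
count_unsignaled_logs() in the "unknown source" case would mean there
is something left to print before the panic. Whether that content can
be trusted is exactly the open question.
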
--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.