Re: [PATCH] x86/mce: fix failure to reenable CMCI when switching to interrupt mode

From: Borislav Petkov
Date: Tue Aug 11 2015 - 10:46:39 EST


On Tue, Aug 11, 2015 at 06:09:37PM +0800, Xie XiuQi wrote:
> Zhang Liguang reported a bug as follows:
> 1) the system detects a CMCI storm on the current CPU
> 2) it disables the CMCI interrupt on banks owned by the current CPU, then switches to poll mode
> 3) a few minutes later, the system switches back to interrupt mode on the current CPU
> 4) we expect the system to reenable the CMCI interrupt on banks owned by the current CPU
>    mce_intel_adjust_timer
>    |-> cmci_reenable
>        |-> cmci_discover  # but owned banks are ignored here
>
> > static void cmci_discover(int banks)
> > ...
> > 	for (i = 0; i < banks; i++) {
> > 		...
> > 		if (test_bit(i, owned))	/* owned banks are ignored here */
> > 			continue;
>
> In this patch, we add a function, cmci_storm_enable_banks(), which just
> enables the banks owned by the current CPU without clearing the owned
> flags. We call this function instead of cmci_reenable() when switching
> to interrupt mode.

Hmm, and we cannot clear the owned bit because those banks won't be
polled otherwise, see:

27f6c573e0f7 ("x86, CMCI: Add proper detection of end of CMCI storms")

Yuck.

Well, ok, but do it differently, please: rename
cmci_storm_disable_banks() to cmci_storm_switch_banks(bool on) which
turns them on and off. Unless Tony has a better suggestion...
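
IOW, something along the lines of the sketch below. This is a userspace
mock only, not kernel code and not tested against the real thing: the
mock_* names, NR_BANKS and the array standing in for the MCi_CTL2 MSRs
are made up for illustration, and mock_cmci_discover() is reduced to
just the owned-bank skip quoted above. The real function would keep
whatever MSR accesses and locking cmci_storm_disable_banks() has today.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_BANKS		8		/* mock value for illustration */
#define MCI_CTL2_CMCI_EN	(1ULL << 30)	/* CMCI enable bit in MCi_CTL2 */

static uint64_t mock_ctl2[NR_BANKS];	/* stands in for the MCi_CTL2 MSRs */
static unsigned long mock_owned;	/* stands in for this CPU's owned-banks bitmap */

/* Simplified discovery: claims unowned banks, never touches owned ones. */
static void mock_cmci_discover(int banks)
{
	for (int i = 0; i < banks; i++) {
		if (mock_owned & (1UL << i))
			continue;	/* the skip that breaks reenabling */
		mock_ctl2[i] |= MCI_CTL2_CMCI_EN;
		mock_owned |= 1UL << i;
	}
}

/* One helper for both directions; the owned bitmap is left alone. */
static void mock_cmci_storm_switch_banks(bool on)
{
	for (int bank = 0; bank < NR_BANKS; bank++) {
		if (!(mock_owned & (1UL << bank)))
			continue;
		if (on)
			mock_ctl2[bank] |= MCI_CTL2_CMCI_EN;
		else
			mock_ctl2[bank] &= ~MCI_CTL2_CMCI_EN;
	}
}
```

The storm path would then call the helper with false when going to poll
mode and with true when going back to interrupt mode, and the owned bits
stay set the whole time so the banks keep getting polled in between.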

> Reported-by: Zhang Liguang <zhangliguang@xxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx # v4.1+

Why 4.1 only?

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--