Re: [PATCH] x86/mce: fix failed to reenable cmci when swiching to interrupt mode

From: Xie XiuQi
Date: Tue Aug 11 2015 - 22:07:37 EST


On 2015/8/11 22:46, Borislav Petkov wrote:
On Tue, Aug 11, 2015 at 06:09:37PM +0800, Xie XiuQi wrote:
Zhang Liguang reported a bug as below:
1) the system detects a cmci storm on the current cpu
2) it disables the cmci interrupt on banks owned by the current cpu, then switches to poll mode
3) a few minutes later, the system switches back to interrupt mode on the current cpu
4) we expect the system to reenable the cmci interrupt on banks owned by the current cpu
mce_intel_adjust_timer
|-> cmci_reenable
    |-> cmci_discover # but owned banks are ignored here

static void cmci_discover(int banks)
...
	for (i = 0; i < banks; i++) {
		...
		if (test_bit(i, owned)) # owned banks are skipped here
			continue;

In this patch, we add a function, cmci_storm_enable_banks(), which only
enables the banks owned by the current cpu, without clearing the owned flags.
We call this function instead of cmci_reenable() when switching to interrupt mode.
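
To make the failure mode concrete, here is a minimal userspace sketch of the sequence described above. It is not the kernel code: the per-bank control registers and the owned bitmap are modeled as plain arrays, every model_* name and MODEL_* constant is hypothetical, and the discover step is heavily simplified (the real cmci_discover() also deals with banks claimed by other cpus and with thresholds). The only detail taken from the hardware definition is that the CMCI enable flag is bit 30 of IA32_MCi_CTL2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_BANKS 4                 /* hypothetical bank count */
#define MODEL_CMCI_EN  (1ULL << 30)      /* CMCI enable bit in IA32_MCi_CTL2 */

uint64_t model_ctl2[MODEL_NR_BANKS];     /* stand-in for the per-bank CTL2 MSRs */
bool model_owned[MODEL_NR_BANKS];        /* stand-in for this cpu's owned bitmap */

/* Reset the model to "no bank owned, CMCI disabled everywhere". */
void model_reset(void)
{
	for (int i = 0; i < MODEL_NR_BANKS; i++) {
		model_ctl2[i] = 0;
		model_owned[i] = false;
	}
}

/* Simplified discover step: claim unowned banks, skip owned ones. */
void model_discover(void)
{
	for (int i = 0; i < MODEL_NR_BANKS; i++) {
		if (model_owned[i])
			continue;            /* owned banks are never touched */
		model_ctl2[i] |= MODEL_CMCI_EN;
		model_owned[i] = true;
	}
}

/* Storm path: turn CMCI off but keep ownership so the banks stay polled. */
void model_storm_disable(void)
{
	for (int i = 0; i < MODEL_NR_BANKS; i++)
		if (model_owned[i])
			model_ctl2[i] &= ~MODEL_CMCI_EN;
}

/* Shape of the proposed helper: re-enable CMCI only on owned banks. */
void model_storm_enable(void)
{
	for (int i = 0; i < MODEL_NR_BANKS; i++)
		if (model_owned[i])
			model_ctl2[i] |= MODEL_CMCI_EN;
}

/* Number of banks that currently have CMCI enabled. */
int model_enabled_count(void)
{
	int n = 0;

	for (int i = 0; i < MODEL_NR_BANKS; i++)
		if (model_ctl2[i] & MODEL_CMCI_EN)
			n++;
	return n;
}
```

In this model, calling model_discover() after model_storm_disable() changes nothing because every bank is still marked owned, whereas model_storm_enable() sets the enable bit again without touching ownership.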

Hmm, and we cannot clear the owned bit because those banks won't be
polled otherwise, see:

27f6c573e0f7 ("x86, CMCI: Add proper detection of end of CMCI storms")

OK, thanks.


Yuck.

Well, ok, but do it differently, please: rename
cmci_storm_disable_banks() to cmci_storm_switch_banks(bool on) which
turns them on and off. Unless Tony has a better suggestion...
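
A rough userspace sketch of what such a single on/off helper could look like follows. It is only an illustration under assumptions: the MSR accesses are replaced by a plain array, the owned bitmap is a hard-coded example value, locking is left out, and all sketch_* names are hypothetical stand-ins rather than the code that was eventually merged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NR_BANKS 4                  /* hypothetical bank count */
#define SKETCH_CMCI_EN  (1ULL << 30)       /* CMCI enable bit in IA32_MCi_CTL2 */

uint64_t sketch_ctl2[SKETCH_NR_BANKS];     /* stand-in for the CTL2 MSRs */
unsigned long sketch_owned = 0x5;          /* example: banks 0 and 2 are owned */

/*
 * One helper for both directions: walk the owned banks and either set or
 * clear the CMCI enable bit. The owned bitmap itself is never modified.
 */
void sketch_storm_switch_banks(bool on)
{
	for (int bank = 0; bank < SKETCH_NR_BANKS; bank++) {
		if (!(sketch_owned & (1UL << bank)))
			continue;              /* not ours, leave it alone */

		if (on)
			sketch_ctl2[bank] |= SKETCH_CMCI_EN;
		else
			sketch_ctl2[bank] &= ~SKETCH_CMCI_EN;
	}
}
```

The storm-detected path would then call the helper with false and the storm-subsided path with true, so both directions share one loop over the owned banks.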

Reported-by: Zhang Liguang <zhangliguang@xxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx # v4.1+

Why 4.1 only?

My fault, it's v3.15+.

Thanks,
Xie XiuQi


