Re: [tip:x86/urgent] x86/mce: Fix CMCI preemption bugs

From: Borislav Petkov
Date: Thu Apr 17 2014 - 16:58:59 EST


On Thu, Apr 17, 2014 at 09:42:41PM +0200, Borislav Petkov wrote:
> On Thu, Apr 17, 2014 at 12:25:14PM -0700, Linus Torvalds wrote:
> > No, Owen tested a simpler patch that just changes the "get_cpu_var()"
> > to "__get_cpu_var()" and avoids the preempt increment.
>
> Which basically would be the same as doing this_cpu_write() in the
> proposed fix - both don't touch preemption. So it is something else.
> More staring...

Ok, in one of the mails Ingo forwarded to me, it said it still failed with

> kernel: [ 7.341085] BUG: using __this_cpu_write() in preemptible [00000000] code: modprobe/546

but considering Owen tried with the simpler __get_cpu_var() version, I
fail to see how the __this_cpu_write() BUG can still happen. Btw, the
__this_cpu_write() accessors have recently gained preemption checks;
there is another thread about that on lkml right now:

http://lkml.kernel.org/r/8761m7lm3j.fsf@xxxxxxxxxxxxx
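
For anyone following along, the difference between the accessors
mentioned above can be sketched in plain user-space C. This is only a
toy model, not kernel code: preempt_count, unsafe_uses and all of the
model_* helpers are made-up stand-ins, and the real this_cpu_write() is
implemented per architecture rather than as shown here. It only
illustrates why a get_cpu_var() without a matching put_cpu_var() leaves
the preempt count elevated, while this_cpu_write() needs no pairing.

```c
#include <assert.h>

/* Toy user-space model; every name below is a hypothetical stand-in. */
static int preempt_count;               /* models the preempt count */
static int unsafe_uses;                 /* accesses made while preemptible */
static unsigned long mce_polled_error;  /* models the per-CPU variable */

/* Models get_cpu_var(): disables preemption, returns the variable. */
static unsigned long *model_get_cpu_var(void)
{
	preempt_count++;
	return &mce_polled_error;
}

/* Models put_cpu_var(): the matching preempt enable. */
static void model_put_cpu_var(void)
{
	assert(preempt_count > 0);      /* catch an unbalanced put */
	preempt_count--;
}

/*
 * Models __this_cpu_write() with debug checks enabled: the caller is
 * expected to have disabled preemption already, so a preemptible
 * caller is recorded as a problem.
 */
static void model_underscore_this_cpu_write(unsigned long val)
{
	if (preempt_count == 0)
		unsafe_uses++;
	mce_polled_error = val;
}

/*
 * Models this_cpu_write(): treated here as one step that is safe on
 * its own and asks for no preempt bookkeeping from the caller.
 */
static void model_this_cpu_write(unsigned long val)
{
	mce_polled_error = val;
}

/* The pattern the patch removes: get without put. */
static void buggy_mark_polled_error(void)
{
	unsigned long *v = model_get_cpu_var();

	*v |= 1UL;                      /* stands in for set_bit(0, v) */
	/* missing model_put_cpu_var(): preempt_count stays elevated */
}

/* A balanced get/put pair, for comparison. */
static void balanced_mark_polled_error(void)
{
	unsigned long *v = model_get_cpu_var();

	*v |= 1UL;
	model_put_cpu_var();
}

/* The pattern the patch introduces. */
static void fixed_mark_polled_error(void)
{
	model_this_cpu_write(1);
}
```

In this model, buggy_mark_polled_error() leaves preempt_count one
higher on every call, whereas the balanced and fixed variants leave it
unchanged.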

So, Owen, can you please clarify exactly which patch you *did* test and
whether it worked or not?

Also, did you test the patch below? If not, please give it a run too.

Thanks.

---
I introduced this bug in commit 27f6c573e0: I forgot to call
put_cpu_var() after get_cpu_var(). Fix it by using this_cpu_write()
instead of get_cpu_var().

v1 -> v2: Separate cleanup from bug fix.

Signed-off-by: Chen, Gong <gong.chen@xxxxxxxxxxxxxxx>
Suggested-by: H. Peter Anvin <hpa@xxxxxxxxx>
---
arch/x86/kernel/cpu/mcheck/mce.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index eeee23f..68317c8 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -598,7 +598,6 @@ void machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
 {
 	struct mce m;
 	int i;
-	unsigned long *v;
 
 	this_cpu_inc(mce_poll_count);
 
@@ -618,8 +617,7 @@ void machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
 		if (!(m.status & MCI_STATUS_VAL))
 			continue;
 
-		v = &get_cpu_var(mce_polled_error);
-		set_bit(0, v);
+		this_cpu_write(mce_polled_error, 1);
 		/*
 		 * Uncorrected or signalled events are handled by the exception
 		 * handler when it is enabled, so don't process those here.
--
1.9.0


--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.