Re: [PATCH v3 5/6] x86/MCE: Save MCA control bits that get set in hardware

From: Luck, Tony
Date: Thu May 16 2019 - 11:54:05 EST


On Tue, Apr 30, 2019 at 08:32:20PM +0000, Ghannam, Yazen wrote:
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index 986de830f26e..551366c155ef 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -1567,10 +1567,13 @@ static void __mcheck_cpu_init_clear_banks(void)
>  	for (i = 0; i < this_cpu_read(mce_num_banks); i++) {
>  		struct mce_bank *b = &mce_banks[i];
>
> -		if (!b->init)
> -			continue;
> -		wrmsrl(msr_ops.ctl(i), b->ctl);
> -		wrmsrl(msr_ops.status(i), 0);
> +		if (b->init) {
> +			wrmsrl(msr_ops.ctl(i), b->ctl);
> +			wrmsrl(msr_ops.status(i), 0);
> +		}
> +
> +		/* Save bits set in hardware. */
> +		rdmsrl(msr_ops.ctl(i), b->ctl);
>  	}
>  }

This looks like it will be a problem for Intel CPUs. If
we take a CPU offline and then bring it back again, we
use "b->ctl" to reinitialize the register in mce_reenable_cpu().

But the Intel SDM says at the end of section "15.3.2.1 IA32_MCi_CTL_MSRs":

"P6 family processors only allow the writing of all 1s or all
0s to the IA32_MCi_CTL MSR."
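
To make the concern concrete, here is a rough user-space sketch, not
kernel code. Everything in it is hypothetical: struct sim_bank,
sim_wrmsr(), sim_rdmsr(), sim_init_bank() and p6_ctl_write_ok() are
made-up names, and it assumes a bank where only some control bits are
implemented and the rest read back as zero. Under that assumption, the
read-back replaces an all-1s saved value with a partial mask, which is
what would later be written when the CPU is re-enabled.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of one bank's IA32_MCi_CTL register. Assumption:
 * only the bits in 'implemented' stick; all others read back as zero.
 */
struct sim_bank {
	uint64_t implemented;	/* bits the simulated hardware keeps */
	uint64_t hw_ctl;	/* current simulated register contents */
	uint64_t saved_ctl;	/* stands in for b->ctl */
};

/* Simulated MSR write: unimplemented bits are silently dropped. */
static void sim_wrmsr(struct sim_bank *b, uint64_t val)
{
	b->hw_ctl = val & b->implemented;
}

/* Simulated MSR read. */
static uint64_t sim_rdmsr(const struct sim_bank *b)
{
	return b->hw_ctl;
}

/* The SDM rule quoted above: only all 1s or all 0s may be written. */
static bool p6_ctl_write_ok(uint64_t val)
{
	return val == 0 || val == ~0ULL;
}

/* Mirrors the patched loop body: write the saved value, then read it back. */
static void sim_init_bank(struct sim_bank *b)
{
	sim_wrmsr(b, b->saved_ctl);
	b->saved_ctl = sim_rdmsr(b);
}
```

With implemented = 0xff and saved_ctl starting as all 1s, calling
sim_init_bank() leaves saved_ctl at 0xff, and p6_ctl_write_ok() then
rejects it. Whether real hardware reads back this way is the open
question; the sketch only shows why the saved value matters.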

-Tony