RE: [PATCH v3 5/6] x86/MCE: Save MCA control bits that get set in hardware

From: Ghannam, Yazen
Date: Fri May 17 2019 - 11:48:30 EST


> -----Original Message-----
> From: linux-edac-owner@xxxxxxxxxxxxxxx <linux-edac-owner@xxxxxxxxxxxxxxx> On Behalf Of Borislav Petkov
> Sent: Friday, May 17, 2019 5:10 AM
> To: Luck, Tony <tony.luck@xxxxxxxxx>
> Cc: Ghannam, Yazen <Yazen.Ghannam@xxxxxxx>; linux-edac@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; x86@xxxxxxxxxx
> Subject: Re: [PATCH v3 5/6] x86/MCE: Save MCA control bits that get set in hardware
>
>
> On Thu, May 16, 2019 at 01:59:43PM -0700, Luck, Tony wrote:
> > I think the intent of the original patch was to find out
> > which bits are "implemented in hardware". I.e. throw all
> > 1's at the register and see if any of them stick.
>
> And, in addition, check ->init before showing/setting a bank:
>
> ---
> @@ -2095,6 +2098,9 @@ static ssize_t show_bank(struct device *s, struct device_attribute *attr,
>
>  	b = &per_cpu(mce_banks_array, s->id)[bank];
>
> +	if (!b->init)
> +		return -ENODEV;
> +
>  	return sprintf(buf, "%llx\n", b->ctl);
>  }
>
> @@ -2113,6 +2119,9 @@ static ssize_t set_bank(struct device *s, struct device_attribute *attr,
>
>  	b = &per_cpu(mce_banks_array, s->id)[bank];
>
> +	if (!b->init)
> +		return -ENODEV;
> +
>  	b->ctl = new;
>  	mce_restart();
> ---
>
> so that you get feedback on whether the setting has even succeeded or
> not. Right now we're doing "something" blindly and accepting any b->ctl
> from userspace. Yeah, it is root-only but still...
>
> > I don't object to the idea behind the patch. But if you want
> > to do this you just should not modify b->ctl.
> >
> > So something like:
> >
> >
> > static void __mcheck_cpu_init_clear_banks(void)
> > {
> > 	struct mce_bank *mce_banks = this_cpu_read(mce_banks_array);
> > 	u64 tmp;
> > 	int i;
> >
> > 	for (i = 0; i < this_cpu_read(mce_num_banks); i++) {
> > 		struct mce_bank *b = &mce_banks[i];
> >
> > 		if (b->init) {
> > 			wrmsrl(msr_ops.ctl(i), b->ctl);
> > 			wrmsrl(msr_ops.status(i), 0);
> > 			rdmsrl(msr_ops.ctl(i), tmp);
> >
> > 			/* Check if any bits implemented in h/w */
> > 			b->init = !!tmp;
> > 		}
>
> ... except that we unconditionally set ->init to 1 in
> __mcheck_cpu_mce_banks_init() and I think we should query it. Btw, that
> name __mcheck_cpu_mce_banks_init() is hideous too. I'll fix those up. In
> the meantime, how does the below look? The change is to tickle out
> from the hw whether some CTL bits stick and then use that to determine
> b->init setting:
>
> ---
> From: Yazen Ghannam <yazen.ghannam@xxxxxxx>
> Date: Tue, 30 Apr 2019 20:32:21 +0000
> Subject: [PATCH] x86/MCE: Determine MCA banks' init state properly
>
> The OS is expected to write all bits to MCA_CTL for each bank,
> thus enabling error reporting in all banks. However, some banks
> may be unused in which case the registers for such banks are
> Read-as-Zero/Writes-Ignored. Also, the OS may avoid setting some control
> bits because of quirks, etc.
>
> A bank can be considered uninitialized if the MCA_CTL register returns
> zero. This is because either the OS did not write anything or because
> the hardware is enforcing RAZ/WI for the bank.
>
> Set a bank's init value based on if the control bits are set or not in
> hardware. Return an error code in the sysfs interface for uninitialized
> banks.
>
> [ bp: Massage a bit. Discover bank init state at boot. ]
>
> Signed-off-by: Yazen Ghannam <yazen.ghannam@xxxxxxx>
> Signed-off-by: Borislav Petkov <bp@xxxxxxx>
> Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: "linux-edac@xxxxxxxxxxxxxxx" <linux-edac@xxxxxxxxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Tony Luck <tony.luck@xxxxxxxxx>
> Cc: "x86@xxxxxxxxxx" <x86@xxxxxxxxxx>
> Link: https://lkml.kernel.org/r/20190430203206.104163-7-Yazen.Ghannam@xxxxxxx
> ---
> arch/x86/kernel/cpu/mce/core.c | 23 ++++++++++++++++++-----
> 1 file changed, 18 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index 5bcecadcf4d9..d84b0c707d0e 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -1492,9 +1492,16 @@ static int __mcheck_cpu_mce_banks_init(void)
>
>  	for (i = 0; i < n_banks; i++) {
>  		struct mce_bank *b = &mce_banks[i];
> +		u64 val;
>
>  		b->ctl = -1ULL;
> -		b->init = 1;
> +
> +		/* Check if any bits are implemented in h/w */
> +		wrmsrl(msr_ops.ctl(i), b->ctl);
> +		rdmsrl(msr_ops.ctl(i), val);
> +		b->init = !!val;
> +
> +		wrmsrl(msr_ops.status(i), 0);
>  	}

I think there are a couple of issues here.
1) The bank is being initialized without accounting for any quirks: writing all 1s to MCA_CTL at this point enables bits that a quirk may need to keep cleared.
2) The bank is being enabled before a handler or any other required setup is in place.
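
To make point 1 concrete, here is a rough user-space sketch, not kernel code. The sim_* names, the two-bank layout, and the quirk mask are all made up for illustration. The only behavior taken from this thread is that unimplemented control bits are Read-as-Zero/Writes-Ignored and that the probe writes all 1s and reads the value back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SIM_NUM_BANKS 2

/* Hypothetical model of one bank's MCA_CTL register. */
struct sim_bank {
	uint64_t implemented;	/* bits that exist in h/w; 0 means RAZ/WI */
	uint64_t ctl;		/* current register contents */
};

static struct sim_bank sim_banks[SIM_NUM_BANKS] = {
	{ .implemented = 0xffULL, .ctl = 0 },	/* bank 0: eight control bits */
	{ .implemented = 0,       .ctl = 0 },	/* bank 1: unused, RAZ/WI */
};

/* Writes to unimplemented bits are silently dropped. */
static void sim_wrmsr_ctl(unsigned int bank, uint64_t val)
{
	assert(bank < SIM_NUM_BANKS);
	sim_banks[bank].ctl = val & sim_banks[bank].implemented;
}

static uint64_t sim_rdmsr_ctl(unsigned int bank)
{
	assert(bank < SIM_NUM_BANKS);
	return sim_banks[bank].ctl;
}

/* The probe under discussion: write all 1s, see if anything sticks. */
static bool sim_probe_all_ones(unsigned int bank)
{
	sim_wrmsr_ctl(bank, ~0ULL);
	return sim_rdmsr_ctl(bank) != 0;
}

/* Which bits from a (hypothetical) quirk mask are currently enabled? */
static uint64_t sim_quirk_bits_enabled(unsigned int bank, uint64_t quirk_mask)
{
	return sim_rdmsr_ctl(bank) & quirk_mask;
}
```

In this model, probing bank 0 reports it as usable, but the register is then left holding a bit that the hypothetical quirk (mask 0x1) wanted cleared. Bank 1 reads back zero and would be marked uninitialized.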

Thanks,
Yazen