Re: [PATCH] x86/mce: add CMCI support for Centaur CPUs
From: David Wang
Date: Sun Jun 03 2018 - 22:10:21 EST
> -----Original Mail-----
> Sender: Borislav Petkov [mailto:bp@xxxxxxxxx]
> Time: 2018å6æ1æ 17:38
> Receiver: David Wang <davidwang@xxxxxxxxxxx>
> CC: tony.luck@xxxxxxxxx; mingo@xxxxxxxxxx; tglx@xxxxxxxxxxxxx;
> hpa@xxxxxxxxx; gregkh@xxxxxxxxxxxxxxxxxx; x86@xxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; linux-edac@xxxxxxxxxxxxxxx;
> brucechang@xxxxxxxxxxxxxxxx; cooperyan@xxxxxxxxxxx;
> qiyuanwang@xxxxxxxxxxx; benjaminpan@xxxxxxxxxxx; lukelin@xxxxxxxxxx;
> timguo@xxxxxxxxxxx
> Topic : Re: [PATCH] x86/mce: add CMCI support for Centaur CPUs
>
> On Thu, May 31, 2018 at 11:28:58AM +0800, David Wang wrote:
> > Newer Centaur support CMCI mechanism, which is compatible with INTEL
> CMCI.
> >
> > Signed-off-by: David Wang <davidwang@xxxxxxxxxxx>
> > ---
> > arch/x86/Kconfig | 12 ++++++++++++
> > arch/x86/kernel/cpu/mcheck/mce.c | 6 ++++++
> > 2 files changed, 18 insertions(+)
> >
> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index
> > dda87a3..1adff5f 100644
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -1130,6 +1130,18 @@ config X86_MCE_AMD
> > Additional support for AMD specific MCE features such as
> > the DRAM Error Threshold.
> >
> > +config X86_MCE_CENTAUR
> > + def_bool y
> > + prompt "CENTAUR MCE features"
> > + depends on CPU_SUP_CENTAUR && X86_MCE_INTEL
> > + help
> > + Additional support for Centaur specific MCE features such as
> > + MCE broadcasting and CMCI support.
> > + New Centaur CPU support MCE broadcasting.
> > + New Centaur CPU support CMCI which is fully compliant with Intel
> CMCI.
> > +
> > + If unsure, say N here.
> > +
> > config X86_ANCIENT_MCE
> > bool "Support for old Pentium 5 / WinChip machine checks"
> > depends on X86_32 && X86_MCE
> > diff --git a/arch/x86/kernel/cpu/mcheck/mce.c
> > b/arch/x86/kernel/cpu/mcheck/mce.c
> > index cd76380..2ebafc7 100644
> > --- a/arch/x86/kernel/cpu/mcheck/mce.c
> > +++ b/arch/x86/kernel/cpu/mcheck/mce.c
> > @@ -1727,6 +1727,7 @@ static void __mcheck_cpu_init_early(struct
> cpuinfo_x86 *c)
> > }
> > }
> >
> > +#ifdef CONFIG_X86_MCE_CENTAUR
> > static void mce_centaur_feature_init(struct cpuinfo_x86 *c) {
> > struct mca_config *cfg = &mca_cfg;
> > @@ -1740,7 +1741,12 @@ static void mce_centaur_feature_init(struct
> cpuinfo_x86 *c)
> > if (cfg->monarch_timeout < 0)
> > cfg->monarch_timeout = USEC_PER_SEC;
> > }
> > + mce_intel_feature_init(c);
> > + mce_adjust_timer = cmci_intel_adjust_timer;
>
> So far so good but above says "New Centaur CPU[s]" but you're calling
> mce_intel_feature_init() unconditionally here, for *all* centaur CPUs.
> Ditto for setting the CMCI handler.
There is a check for CMCI support or not In cmci_supported() function which will be called by mce_intel_feature_init.
return !!(cap & MCG_CMCI_P);
So, I didn't add Family/Model/Stepping check.
But, I am sorry to add another patch just as following:
@@ -85,7 +85,8 @@ static int cmci_supported(int *banks)
* initialization is vendor keyed and this
* makes sure none of the backdoors are entered otherwise.
*/
- if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
+ if ((boot_cpu_data.x86_vendor != X86_VENDOR_INTEL &&
+ boot_cpu_data.x86_vendor != X86_VENDOR_CENTAUR))
return 0;
if (!boot_cpu_has(X86_FEATURE_APIC) || lapic_get_maxlvt() < 6)
return 0;
I will send patch v4 later to include this patch.
Thank you.
>
> Also mce_intel_feature_init() does more things than init CMCI so I think you
> wanna limit that to only intel_init_cmci(). IOW, something like the hunk below.
>
> And frankly I'm not crazy about adding centaur code to mce_intel.c but since it
> is copying functionality, I think we should copy the sw support in the kernel too
> instead of adding a mce_centaur.c duplicate. For now at least...
>
> Thx.
>
> ---
> diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c
> b/arch/x86/kernel/cpu/mcheck/mce_intel.c
> index d05be307d081..77d8a9b996a6 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
> @@ -506,8 +506,15 @@ static void intel_ppin_init(struct cpuinfo_x86 *c)
>
> void mce_intel_feature_init(struct cpuinfo_x86 *c) {
> - intel_init_thermal(c);
> intel_init_cmci();
> +
> + /*
> + * Some Centaur variants support CMCI.
> + */
> + if (c->x86_vendor == X86_VENDOR_CENTAUR)
> + return;
> +
> + intel_init_thermal(c);
> intel_init_lmce();
> intel_ppin_init(c);
> }
>
OK. I will send patch V4 to fix this problem, just like:
void mce_intel_feature_init(struct cpuinfo_x86 *c)
{
- intel_init_thermal(c);
- intel_init_cmci();
- intel_init_lmce();
- intel_ppin_init(c);
+
+ switch (c->x86_vendor) {
+ case X86_VENDOR_INTEL:
+ intel_init_thermal(c);
+ intel_init_cmci();
+ intel_init_lmce();
+ intel_ppin_init(c);
+ break;
+ case X86_VENDOR_CENTAUR:
+ intel_init_cmci();
+ break;
+ default:
+ break;
+ }
}
Thank you.
> --
> Regards/Gruss,
> Boris.
>
> ECO tip #101: Trim your mails when you reply.
> --