Re: [PATCH v2 06/16] x86/mce: Remove __mcheck_cpu_init_early()

From: Yazen Ghannam
Date: Thu Feb 27 2025 - 14:59:51 EST


On Thu, Feb 27, 2025 at 08:33:19PM +0100, Borislav Petkov wrote:
> On February 27, 2025 5:31:48 PM GMT+01:00, Yazen Ghannam <yazen.ghannam@xxxxxxx> wrote:
> >On Thu, Feb 27, 2025 at 04:25:00PM +0100, Borislav Petkov wrote:
> >> On Thu, Feb 13, 2025 at 04:45:55PM +0000, Yazen Ghannam wrote:
> >> > Also, move __mcheck_cpu_init_generic() after
> >> > __mcheck_cpu_init_prepare_banks() so that MCA is enabled after the first
> >> > MCA polling event.
> >>
> >> The reason being?
> >>
> >> Precaution?
> >>
> >> It was this way since forever, why are you moving it now? Any particular
> >> reason?
> >>
> >
> >1) To read/clear old errors before turning on MCA. The updated
> >__mcheck_cpu_init_prepare_banks() function does this for the MCi_CTL
> >registers. This patch does this for the MCG_CTL register too.
> >
> >2) To ensure that vendor-specific setup is finished beforehand also.
>
> That doesn't answer my question. All of the above gets done even without shuffling the order...
>
>

MCA banks can start logging errors once MCG_CTL is set. The AMD docs say
"The operating system must initialize the MCA_CONFIG registers prior to
initialization of the MCA_CTL registers."

"The MCA_CTL registers must be initialized prior to enabling the error
reporting banks in MCG_CTL".

However, the Intel docs ("Machine-Check Initialization Pseudocode") say
to set MCG_CTL first, then MCi_CTL.

But both agree that CR4.MCE should be set last.
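
To make the difference concrete, here is a rough user-space sketch of the
two documented orderings as summarized above. It is only an illustration:
the struct and helper names are made up for this mail and are not kernel
symbols, and it compares orderings as data rather than touching any MSRs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table type for illustration; not a kernel structure. */
struct mca_init_order {
	const char *source;	/* which vendor document the order comes from */
	const char *steps[4];	/* steps in the order they should run */
	size_t nsteps;
};

/* Order as quoted from the AMD docs in this thread. */
static const struct mca_init_order amd_order = {
	.source = "AMD",
	.steps  = { "MCA_CONFIG", "MCA_CTL", "MCG_CTL", "CR4.MCE" },
	.nsteps = 4,
};

/* Order as summarized here from the Intel initialization pseudocode. */
static const struct mca_init_order intel_order = {
	.source = "Intel",
	.steps  = { "MCG_CTL", "MCi_CTL", "CR4.MCE" },
	.nsteps = 3,
};

/* Return the position of @step in @o, or -1 if it is not listed. */
static int step_index(const struct mca_init_order *o, const char *step)
{
	for (size_t i = 0; i < o->nsteps; i++)
		if (strcmp(o->steps[i], step) == 0)
			return (int)i;
	return -1;
}

/* Return 1 if both steps are listed and @first runs before @second. */
static int runs_before(const struct mca_init_order *o,
		       const char *first, const char *second)
{
	int a = step_index(o, first);
	int b = step_index(o, second);

	return a >= 0 && b >= 0 && a < b;
}

/* Abort if CR4.MCE is not the final step of @o. */
static void check_cr4_mce_last(const struct mca_init_order *o)
{
	assert(o->nsteps > 0);
	assert(strcmp(o->steps[o->nsteps - 1], "CR4.MCE") == 0);
}
```

With these tables, runs_before(&amd_order, "MCA_CTL", "MCG_CTL") is true,
while the Intel table has MCG_CTL ahead of MCi_CTL, and
check_cr4_mce_last() passes for both.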

We have an old thread on the topic that led to this patch.
https://lore.kernel.org/all/YqJHwXkg3Ny9fI3s@yaz-fattaah/

And it seemed okay at the time.
https://lore.kernel.org/all/YrnTMmwl5TrHwT9J@xxxxxxx/

I don't think much has changed since then, so I included the old patch
again in this set.

Thanks,
Yazen