Re: [PATCH v2 1/2] cper, apei, mce: Pass x86 CPER through the MCA handling chain

From: Yazen Ghannam
Date: Tue Sep 01 2020 - 15:36:37 EST


On Fri, Aug 28, 2020 at 03:33:31PM -0500, Smita Koralahalli wrote:
...
> +int apei_mce_report_x86_error(struct cper_ia_proc_ctx *ctx_info, u64 lapic_id)
> +{
> + const u64 *i_mce = ((const void *) (ctx_info + 1));
> + unsigned int cpu;
> + struct mce m;
> +
> + if (!boot_cpu_has(X86_FEATURE_SMCA))
> + return -EINVAL;
> +

This function is called on any context type, but it can only decode
"MSR" types that follow the MCAX register layout used on Scalable MCA
systems.

So I think there should be a couple of checks added:
1) Context type is "MSR".
2) Register layout follows what is expected below. There's no explicit
way to do this, since the data is implementation-specific. But at least
there can be a check that the starting MSR address matches the first
expected register: the bank's MCA_STATUS in MCAX space (0xC0002XX1).

For example:

(ctx_info->msr_addr & 0xC0002001) == 0xC0002001

The raw value in the example should be defined with a name.
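
To make the two checks concrete, here is a rough standalone sketch, not
a drop-in for the patch. The struct is a cut-down stand-in for struct
cper_ia_proc_ctx, and every macro and function name in it is
hypothetical. The context type value of 1 for "MSR" is my reading of
the CPER IA32/X64 context definitions, so please verify it. The address
test is a slightly stricter variant of the example above: it masks out
the bank field ("XX") and compares the rest against bank 0's
MCA_STATUS.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the real patch would pick its own. */
#define CTX_TYPE_MSR_STUB      1            /* assumed CPER context type for "MSR" */
#define MCAX_MC0_STATUS_STUB   0xC0002001u  /* bank 0 MCA_STATUS in MCAX space */
#define MCAX_BANK_FIELD_STUB   0x00000FF0u  /* the "XX" in 0xC0002XX1 */

/* Cut-down stand-in for struct cper_ia_proc_ctx (only the fields used here). */
struct ctx_stub {
	uint16_t reg_ctx_type;
	uint32_t msr_addr;
};

/*
 * Return true only if the context is an MSR context whose first
 * register is some bank's MCA_STATUS in MCAX space.
 */
static bool ctx_looks_like_smca_status(const struct ctx_stub *ctx)
{
	if (ctx->reg_ctx_type != CTX_TYPE_MSR_STUB)
		return false;

	/* Ignore the bank number, then compare against bank 0's address. */
	return (ctx->msr_addr & ~MCAX_BANK_FIELD_STUB) == MCAX_MC0_STATUS_STUB;
}
```

With a helper along these lines, the function could return -EINVAL
early whenever the check fails, before touching the register array.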

> + mce_setup(&m);
> +
> + m.extcpu = -1;
> + m.socketid = -1;
> +
> + for_each_possible_cpu(cpu) {
> + if (cpu_data(cpu).initial_apicid == lapic_id) {
> + m.extcpu = cpu;
> + m.socketid = cpu_data(m.extcpu).phys_proc_id;
> + break;
> + }
> + }
> +
> + m.apicid = lapic_id;
> + m.bank = (ctx_info->msr_addr >> 4) & 0xFF;
> + m.status = *i_mce;
> + m.addr = *(i_mce + 1);
> + m.misc = *(i_mce + 2);
> + /* Skipping MCA_CONFIG */
> + m.ipid = *(i_mce + 4);
> + m.synd = *(i_mce + 5);
> +
> + mce_log(&m);
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(apei_mce_report_x86_error);
> +

Thanks,
Yazen