Re: [PATCH v15 22/23] x86/mce: Improve error log of kernel space TDX #MC due to erratum

From: Borislav Petkov
Date: Tue Dec 05 2023 - 14:57:17 EST


On Tue, Dec 05, 2023 at 07:41:41PM +0000, Huang, Kai wrote:
> -static const char *mce_memory_info(struct mce *m)
> +static const char *mce_dump_aux_info(struct mce *m)
> {
> - if (!m || !mce_is_memory_error(m) || !mce_usable_address(m))
> - return NULL;
> -
> /*
> - * Certain initial generations of TDX-capable CPUs have an
> - * erratum. A kernel non-temporal partial write to TDX private
> - * memory poisons that memory, and a subsequent read of that
> - * memory triggers #MC.
> - *
> - * However such #MC caused by software cannot be distinguished
> - * from the real hardware #MC. Just print additional message
> - * to show such #MC may be result of the CPU erratum.
> + * Confidential computing platforms such as TDX platforms
> + * may occur MCE due to incorrect access to confidential
> + * memory. Print additional information for such error.
> */
> - if (!boot_cpu_has_bug(X86_BUG_TDX_PW_MCE))
> + if (!m || !mce_is_memory_error(m) || !mce_usable_address(m))
> return NULL;

What's the point of doing this on !TDX? None.

> - return !tdx_is_private_mem(m->addr) ? NULL :
> - "TDX private memory error. Possible kernel bug.";
> + if (platform_tdx_enabled())

So is this the "host is TDX" check?

Not an X86_FEATURE_ flag but something homegrown. And Kirill is trying to
switch the CC_ATTRs to X86_FEATURE_ flags for SEV, yet here you guys are
adding another homegrown check.

Why not an X86_FEATURE_ flag?

The CC_ATTR things are for guests, I guess, but the host feature checks
should be X86_FEATURE_ flags.
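
To make the two points above concrete, here is a minimal userspace sketch,
not kernel code. It assumes a hypothetical host-side feature bit
(FEAT_TDX_HOST), a mocked capability mask in place of the real CPU feature
machinery, and a made-up is_private_mem() rule. None of these names come
from the patch. The only thing it illustrates is the ordering: test the
feature flag first so that non-TDX hosts return immediately.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names are taken from the patch. */
enum { FEAT_TDX_HOST = 0 };	/* imagined X86_FEATURE_-style bit */
static uint32_t cpu_caps;	/* mock of the boot CPU capability mask */

static void set_cap(int bit) { cpu_caps |= 1u << bit; }
static bool has_cap(int bit) { return (cpu_caps & (1u << bit)) != 0; }

/* Reduced stand-in for struct mce, carrying only what the check needs. */
struct mce_stub {
	uint64_t addr;
	bool memory_error;
	bool usable_addr;
};

/* Mock rule: pretend everything at or above 1 GiB is TDX private memory. */
static bool is_private_mem(uint64_t addr) { return addr >= (1ull << 30); }

static const char *dump_aux_info(const struct mce_stub *m)
{
	/* Feature check comes first: non-TDX hosts do no further work. */
	if (!has_cap(FEAT_TDX_HOST))
		return NULL;

	if (!m || !m->memory_error || !m->usable_addr)
		return NULL;

	return is_private_mem(m->addr) ?
		"TDX private memory error. Possible kernel bug." : NULL;
}
```

With the bit clear, dump_aux_info() returns NULL for every input. After
set_cap(FEAT_TDX_HOST), it returns the message only for a usable memory
error whose address falls in the mocked private range.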

Hmmm.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette