Re: [PATCH v3] x86/mce: retrieve poison range from hardware

From: Jane Chu
Date: Mon Jul 18 2022 - 17:12:10 EST


On 7/18/2022 12:22 PM, Luck, Tony wrote:
>> It appears the kernel is trusting that ->physical_addr_mask is non-zero
>> in other paths. So this is at least equally broken in the presence of a
>> broken BIOS. The impact is potentially larger though with this change,
>> so it might be a good follow-on patch to make sure that
>> ->physical_addr_mask gets fixed up to a minimum mask value.
>
> Agreed. Separate patch to sanitize early, so other kernel code can just use it.
>

Is it possible that, even when the check
    if (mem->validation_bits & CPER_MEM_VALID_PA_MASK)
passes, ->physical_addr_mask is still untrustworthy?

include/ras/ras_event.h has this
    if (mem->validation_bits & CPER_MEM_VALID_PA_MASK)
        __entry->pa_mask_lsb =
            (u8)__ffs64(mem->physical_addr_mask);
    else
        __entry->pa_mask_lsb = ~0;
which hints otherwise.

apei_mce_report_mem_error() already checks mem->validation_bits
up front.
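
To make the question concrete, here is a rough userspace sketch (not
the patch itself) of a check that trusts the mask only when the
validation bit is set and the mask is non-zero. poison_lsb(),
lowest_set_bit() and FALLBACK_LSB are hypothetical names made up for
illustration. The CPER_MEM_VALID_PA_MASK value is copied from
include/linux/cper.h so the snippet builds on its own, and the
page-sized fallback is an assumption.

```c
#include <stdint.h>

/* Copied from include/linux/cper.h so this sketch builds in userspace. */
#define CPER_MEM_VALID_PA_MASK 0x0004

/* Hypothetical fallback granularity: one 4 KiB page (assumption). */
#define FALLBACK_LSB 12

/* Index of the lowest set bit; the caller guarantees mask != 0. */
static unsigned int lowest_set_bit(uint64_t mask)
{
	unsigned int bit = 0;

	while (!(mask & 1)) {
		mask >>= 1;
		bit++;
	}
	return bit;
}

/*
 * Hypothetical helper: derive the poison granularity from a CPER memory
 * error record. The mask is used only if firmware flagged it valid AND
 * it is non-zero; otherwise fall back to page granularity.
 */
static unsigned int poison_lsb(uint64_t validation_bits,
			       uint64_t physical_addr_mask)
{
	if ((validation_bits & CPER_MEM_VALID_PA_MASK) && physical_addr_mask)
		return lowest_set_bit(physical_addr_mask);

	return FALLBACK_LSB;
}
```

With this, a record that sets the valid bit but carries a zero mask
falls back to the page-sized default instead of feeding 0 to a
find-first-set helper.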

thanks!
-jane


> -Tony