Re: [PATCH RESEND V2 1/2] x86/mce: Fix missing address mask in recovery for errors in TDX/SEAM non-root mode

From: Adrian Hunter
Date: Thu Aug 21 2025 - 02:52:38 EST


On 20/08/2025 20:56, Yazen Ghannam wrote:
> On Wed, Aug 20, 2025 at 04:12:28PM +0000, Luck, Tony wrote:
>>>>> For struct mce? Maybe that should be 2 new fields:
>>>>>
>>>>> __u64 addr; /* Deprecated */
>>>>> ...
>>>>> __u64 mci_addr; /* Bank's MCi_ADDR MSR */
>>>>> __u64 phys_addr; /* Physical address */
>>>>
>>>> Would "addr" keep the current (low bits masked, high bits preserved) value?
>>>
>>> Yeah, it wouldn't make much sense if phys_addr was the same as addr anyway.
>>> Not really thinking
>>
>> The other option (but a bad one) would be:
>>
>> __u64 deprecated; /* was "addr" */
>> ...
>> __u64 mci_addr; /* Bank's MCi_ADDR MSR */
>> __u64 phys_addr; /* Physical address */
>>
>> which would be good to force cleanup in the kernel, but bad for preserving
>> ABI (since "struct mce" is visible to user space via /dev/mcelog).
>>
>
> /dev/mcelog has been deprecated for a while.

There is also the mce_record tracepoint.

>
> Is the mcelog app still in active development? Could it be updated to
> use trace events for MCE info?
>
> You could also just fix up the address value in the mcelog notifier's
> copy. I believe it has its own cache separate from the MCE genpool.

Is there an advantage to fixing up later rather than when addr is
initially assigned?
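
To make the "fix up when addr is initially assigned" option concrete, here is
a minimal user-space sketch, not kernel code. The struct, the function names,
and the phys_bits parameter are all hypothetical; only the mci_addr and
phys_addr field names come from the proposal quoted above. It assumes the
physical address is simply the low phys_bits bits of the raw MCi_ADDR value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two fields proposed in this thread. */
struct sketch_mce {
	uint64_t mci_addr;   /* raw MCi_ADDR value as read from the bank */
	uint64_t phys_addr;  /* raw value with non-address high bits cleared */
};

/* Mask covering bits [phys_bits - 1:0]; phys_bits is an assumed input. */
static uint64_t sketch_phys_mask(unsigned int phys_bits)
{
	assert(phys_bits >= 1 && phys_bits <= 64);
	/* Shifting a 64-bit value by 64 is undefined, so handle it apart. */
	return phys_bits == 64 ? UINT64_MAX : (UINT64_C(1) << phys_bits) - 1;
}

/* Fix up at assignment time: keep the raw value and the masked value. */
static void sketch_record_addr(struct sketch_mce *m, uint64_t raw,
			       unsigned int phys_bits)
{
	m->mci_addr = raw;
	m->phys_addr = raw & sketch_phys_mask(phys_bits);
}
```

With the mask applied once at assignment, every later consumer (recovery,
the tracepoint, the mcelog copy) would see the same phys_addr without needing
its own fixup.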