Re: [PATCH v4] cper, apei, mce: Pass x86 CPER through the MCA handling chain
From: Punit Agrawal
Date: Mon Sep 28 2020 - 04:07:18 EST
Yazen Ghannam <yazen.ghannam@xxxxxxx> writes:
> On Fri, Sep 25, 2020 at 09:54:06AM +0900, Punit Agrawal wrote:
>> Borislav Petkov <bp@xxxxxxxxx> writes:
>>
>> > On Thu, Sep 24, 2020 at 12:23:27PM -0500, Smita Koralahalli Channabasappa wrote:
>> >> > Even though it's not defined in the UEFI spec, it doesn't mean a
>> >> > structure definition cannot be created.
>> >
>> > Created for what? That structure better have a big fat comment above it, what
>> > firmware generates its layout.
>>
>> Maybe I could've used a better choice of words - I meant to define a
>> structure with meaningful member names to replace the *(ptr + i)
>> accesses in the patch.
>>
>> The requirement for documenting the record layout doesn't change -
>> whether using raw pointer arithmetic vs a structure definition.
>>
>> >> > After all, the patch is relying on some guarantee of the meaning of
>> >> > the values and their ordering.
>> >
>> > AFAICT, this looks like an ad-hoc definition and the moment they change
>> > it in some future revision, that struct of yours becomes invalid so we'd
>> > need to add another one.
>>
>> If there's no spec backing the current layout, then it'll indeed be an
>> ad-hoc definition of a structure in the kernel. But considering that
>> it's part of firmware / OS interface for an important part of the RAS
>> story I would hope that the code is based on a spec - having that
>> reference included would help maintainability.
>>
>> Incompatible changes will indeed break the assumptions in the kernel and
>> code will need to be updated - regardless of the choice of kernel
>> implementation; pointer arithmetic, structure definition - ad-hoc or
>> spec provided.
>>
>> Having versioning will allow running older kernels on newer hardware and
>> vice versa - but I don't see why that is important only when using a
>> structure based access.
>>
>
> There is no versioning option for the x86 context info structure in the
> UEFI spec, so I don't think there'd be a clean way to include version
> information.
>
> The format of the data in the context info is not totally ad-hoc, and it
> does follow the UEFI spec. The "Register Array" field is raw data. This
> may follow one of the predefined formats in the UEFI spec like the "X64
> Register State", etc. Or, in the case of MSR and Memory Mapped
> Registers, this is a raw dump of the registers starting from the address
> shown in the structure. The two values that can be changed are the
> starting address and the array size. These two together provide a window
> to the registers. The registers are fixed, so a single context info
> struture should include a single contiguous range of registers. Multiple
> context info structures can be provided to include registers from
> different, non-contiguous ranges.
>
> This patch is checking if an MSR context info structure lines up with
> the MCAX register space used on Scalable MCA systems. This register
> space is defined in the AMD Processor Programming Reference for various
> products. This is considered a hardware feature extension, so the
> existing register layout won't change though new registers may be added.
> A layout change would require moving to another register space which is
> what happened going from legacy MCA (starting at address 0x400) to MCAX
> (starting at address 0xC0002000) registers.
Thanks for the SMCA related background.
>
> The only two things firmware can change are from what address does the
> info start and where does the info end. So the implementation-specific
> details here are that currently the starting address is MCA_STATUS (in
> MCAX space) for a bank and the remaining info includes the other MCA
> registers for this bank.
>
> So I think the kernel can be strict with this format, i.e. the two
> variables match what we're looking for. This patch already has a check
> on the starting address. It should also include a check that "Register
> Array Size" is large enough to include all the registers we want to
> extract. If the format doesn't match, then we fall back to a raw dump
> of the data like we have today.
>
> Or the kernel can be more flexible and try to find the window of
> registers based on the starting address. I think this is really
> open-ended though.
I think I understand the hesitancy here if the firmware can arbitrarily
move the starting address. Though I hope that doesn't happen as it would
break the feature introduced in $SUBJECT.
The way I read the code / spec led me to believe that the MSR context
info records in the SMCA space are just encoding the layout of MC Bank
registers[0] and making it explicit can only help.
But Boris seems to think the current approach is good enough. So no
objections from me.
Thanks,
Punit
[0] AMD Processor Programming Reference for Family 17H, Sec 3.1.5