Re: [PATCH v3 4/7] acpi/ghes: Add a logic to handle block addresses and FW first ARM processor error injection
From: Mauro Carvalho Chehab
Date: Thu Aug 01 2024 - 10:34:27 EST
Em Mon, 29 Jul 2024 16:32:41 +0200
Markus Armbruster <armbru@xxxxxxxxxx> escreveu:
> Yes, as this CPER record is defined only for arm. There are three other
> > processor error info:
> > - for x86;
> > - for ia32;
> > - for "generic cpu".
> >
> > They have different structures, with different fields.
>
> A generic inject-error command feels nicer, but coding its arguments in
> the schema could be more trouble than it's worth. I'm not asking you to
> try.
>
> A target-specific command like this one should be conditional. Try
> this:
>
> { 'command': 'arm-inject-error',
> 'data': { 'errortypes': ['ArmProcessorErrorType'] },
> 'features': [ 'unstable' ],
> 'if': 'TARGET_ARM' }
>
> No need to provide a qmp_arm_inject_error() stub then.
I tried it, but it generates lots of poison errors. Basically, QAPI
generation includes poison.h, making it to complain about on
non-ARM builds.
Anyway, the new version I'm about to submit is not dependent on
ARM anymore (as it is a generic GHES error injection that can be used
by any arch).
Still, as I added a Kconfig symbol for it, I still needed a stub.
It would be cool not needing it, but on the other hand it doesn't
hurt much.
> >> If we encode the the error to inject as an enum value, adding more will
> >> be hard.
> >>
> >> If we wrap the enum in a struct
> >>
> >> { 'struct': 'ArmProcessorError',
> >> 'data': { 'type': 'ArmProcessorErrorType' } }
> >>
> >> we can later extend it like
> >>
> >> { 'union': 'ArmProcessorError',
> >> 'base: { 'type': 'ArmProcessorErrorType' }
> >> 'data': {
> >> 'bus-error': 'ArmProcessorBusErrorData' } }
> >>
> >> { 'struct': 'ArmProcessorBusErrorData',
> >> 'data': ... }
> >
> > I don't see this working as one might expect. See, the ARM error
> > information data can be repeated from 1 to 255 times. It is given
> > by this struct (see patch 7):
> >
> > { 'struct': 'ArmProcessorErrorInformation',
> > 'data': { '*validation': ['ArmPeiValidationBits'],
> > 'type': ['ArmProcessorErrorType'],
> > '*multiple-error': 'uint16',
> > '*flags': ['ArmProcessorFlags'],
> > '*error-info': 'uint64',
> > '*virt-addr': 'uint64',
> > '*phy-addr': 'uint64'}
> > }
> >
> > According with the UEFI spec, the type is always be present.
> > The other fields are marked as valid or not via the field
> > "validation". So, there's one bit indicating what is valid between
> > the fields at the PEI structure, e. g.:
> >
> > - multiple-error: multiple occurrences of the error;
> > - flags;
> > - error-info: error information;
> > - virt-addr: virtual address;
> > - phy-addr: physical address.
> >
> > There are also other fields that are global for the entire record,
> > also marked as valid or not via another bitmask.
> >
> > The contents of almost all those fields are independent of the error
> > type. The only field which content is affected by the error type is
> > "error-info", and the definition of such field is not fully specified.
> >
> > So, currently, UEFI spec only defines it when:
> >
> > 1. the error type has just one bit set;
> > 2. the error type is either cache, TLB or bus error[1].
> > If type is micro-arch-specific error, the spec doesn't tell how this
> > field if filled.
> >
> > To make the API simple (yet powerful), I opted to not enforce any encoding
> > for error-info: let userspace fill it as required and use some default
> > that would make sense, if this is not passed via QMP.
> >
> > [1] See https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#arm-processor-error-information
>
> I asked because designing for extensibility is good practice.
>
> It's not a hard requirement here, because feature 'unstable' gives us
> lincense to change the interface incompatibly.
IMO keeping it as unstable makes sense, as this QAPI is specific for
error injection, which is hardly a feature widely used. Also, with the
script approach, the actual CPER record generation happens on a script.
If we provide it together with QEMU, if the QAPI ever changes, the
changes inside the script will happen altogether. So, IMO, no need to
make it stable.
Thanks,
Mauro