Re: [PATCH v3 7/7] acpi/ghes: extend arm error injection logic

From: Mauro Carvalho Chehab
Date: Mon Jul 29 2024 - 07:19:07 EST


Em Thu, 25 Jul 2024 12:03:46 +0200
Markus Armbruster <armbru@xxxxxxxxxx> escreveu:

> Mauro Carvalho Chehab <mchehab+huawei@xxxxxxxxxx> writes:
>
> > Enrich CPER error injection logic for ARM processor to allow
> > setting values to from UEFI 2.10 tables N.16 and N.17.
> >
> > It should be noticed that, with such change, all arguments are
> > now optional, so, once QMP is negotiated with:
> >
> > { "execute": "qmp_capabilities" }
> >
> > the simplest way to generate a cache error is to use:
> >
> > { "execute": "arm-inject-error" }
> >
> > Also, as now PEI is mapped into an array, it is possible to
> > inject multiple errors at the same CPER record with:
> >
> > { "execute": "arm-inject-error", "arguments": {
> > "error": [ {"type": [ "cache-error" ]},
> > {"type": [ "tlb-error" ]} ] } }
> >
> > This would generate both cache and TLB errors, using default
> > values for other fields.
> >
> > As all fields from ARM Processor CPER are now mapped, all
> > types of CPER records can be generated with the new QAPI.
> >
> > Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@xxxxxxxxxx>
>
> [...]
>
> > diff --git a/qapi/arm-error-inject.json b/qapi/arm-error-inject.json
> > index 430e6cea6b60..2a314830fe60 100644
> > --- a/qapi/arm-error-inject.json
> > +++ b/qapi/arm-error-inject.json
> > @@ -2,40 +2,258 @@
> > # vim: filetype=python
> >
> > ##
> > -# = ARM Processor Errors
> > +# = ARM Processor Errors as defined at:
> > +# https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html
> > +# See tables N.16, N.17 and N.21.
> > ##
>
> This comes out badly in HTML.
>
> Try something like
>
> # = ARM Processor Errors
> #
> # These are defined at
> # https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html
> # See tables N.16, N.17 and N.21.

Ok. I double-checked the results of both manpage and html on the
version I'm about to submit. The parsed macros should now be OK
for both.

>
> If any part of this is relevant in PATCH 4 already, squash the relevant
> parts into that patch please.

Ok. I ended squashing patch 7 with patch 4.

>
> >
> > +##
> > +# @ArmProcessorValidationBits:
> > +#
> > +# Indcates whether or not fields of ARM processor CPER record are valid.
>
> docs/devel/qapi-code-gen.rst section "Documentation markup":
>
> For legibility, wrap text paragraphs so every line is at most 70
> characters long.

Ok.

> > +#
> > +# @mpidr-valid: MPIDR Valid
> > +#
> > +# @affinity-valid: Error affinity level Valid
> > +#
> > +# @running-state-valid: Running State
> > +#
> > +# @vendor-specific-valid: Vendor Specific Info Valid
> > +#
> > +# Since: 9.1
> > +##
> > +{ 'enum': 'ArmProcessorValidationBits',
> > + 'data': ['mpidr-valid',
> > + 'affinity-valid',
> > + 'running-state-valid',
> > + 'vendor-specific-valid']
> > +}
> > +
> > +##
> > +# @ArmProcessorFlags:
> > +#
> > +# Indicates error attributes at the Error info section.
> > +#
> > +# @first-error-cap: First error captured
> > +#
> > +# @last-error-cap: Last error captured
> > +#
> > +# @propagated: Propagated
> > +#
> > +# @overflow: Overflow
> > +#
> > +# Since: 9.1
> > +##
> > +{ 'enum': 'ArmProcessorFlags',
> > + 'data': ['first-error-cap',
> > + 'last-error-cap',
> > + 'propagated',
> > + 'overflow']
> > +}
> > +
> > +##
> > +# @ArmProcessorRunningState:
> > +#
> > +# Indicates if the processor is running.
> > +#
> > +# @processor-running: indicates that the processor is running
> > +#
> > +# Since: 9.1
> > +##
> > +{ 'enum': 'ArmProcessorRunningState',
> > + 'data': ['processor-running']
> > +}
> > +
> > ##
> > # @ArmProcessorErrorType:
> > #
> > -# Type of ARM processor error to inject
> > -#
> > -# @unknown-error: Unknown error
> > +# Type of ARM processor error information to inject.
> > #
> > # @cache-error: Cache error
> > #
> > # @tlb-error: TLB error
> > #
> > -# @bus-error: Bus error.
> > +# @bus-error: Bus error
> > #
> > -# @micro-arch-error: Micro architectural error.
> > +# @micro-arch-error: Micro architectural error
> > #
> > # Since: 9.1
> > ##
> > { 'enum': 'ArmProcessorErrorType',
> > - 'data': ['unknown-error',
> > - 'cache-error',
> > + 'data': ['cache-error',
> > 'tlb-error',
> > 'bus-error',
> > 'micro-arch-error']
> > + }
>
> Squash the changes to this type into PATCH 4, please.

Ok.

> > +
> > +##
> > +# @ArmPeiValidationBits:
> > +#
> > +# Indcates whether or not fields of Processor Error Info section are valid.
> > +#
> > +# @multiple-error-valid: Information at multiple-error field is valid
> > +#
> > +# @flags-valid: Information at flags field is valid
> > +#
> > +# @error-info-valid: Information at error-info field is valid
> > +#
> > +# @virt-addr-valid: Information at virt-addr field is valid
> > +#
> > +# @phy-addr-valid: Information at phy-addr field is valid
> > +#
> > +# Since: 9.1
> > +##
> > +{ 'enum': 'ArmPeiValidationBits',
> > + 'data': ['multiple-error-valid',
> > + 'flags-valid',
> > + 'error-info-valid',
> > + 'virt-addr-valid',
> > + 'phy-addr-valid']
> > +}
> > +
> > +##
> > +# @ArmProcessorErrorInformation:
> > +#
> > +# Contains ARM processor error information (PEI) data according with UEFI
> > +# CPER table N.17.
> > +#
> > +# @validation:
> > +# Valid validation bits for error-info section.
> > +# Argument is optional. If not specified, those flags will be enabled:
> > +# first-error-cap and propagated.
>
> Please format like this for consistency:
>
> # @validation: Valid validation bits for error-info section.
> # Argument is optional. If not specified, those flags will be
> # enabled: first-error-cap and propagated.

Ok.

>
> > +#
> > +# @type:
> > +# ARM processor error types to inject. Argument is mandatory.
> > +#
> > +# @multiple-error:
> > +# Indicates whether multiple errors have occurred.
> > +# Argument is optional. If not specified and @validation not enforced,
> > +# this field will be marked as invalid at CPER record..
> > +#
> > +# @flags:
> > +# Indicates flags that describe the error attributes.
> > +# Argument is optional. If not specified and defaults to
> > +# first-error and propagated.
> > +#
> > +# @error-info:
> > +# Error information structure is specific to each error type.
> > +# Argument is optional, and its value depends on the PEI type(s).
> > +# If not defined, the default depends on the type:
> > +# - for cache-error: 0x0091000F;
> > +# - for tlb-error: 0x0054007F;
> > +# - for bus-error: 0x80D6460FFF;
> > +# - for micro-arch-error: 0x78DA03FF;
> > +# - if multiple types used, this bit is disabled from @validation bits.
> > +#
> > +# @virt-addr:
> > +# Virtual fault address associated with the error.
> > +# Argument is optional. If not specified and @validation not enforced,
> > +# this field will be marked as invalid at CPER record..
> > +#
> > +# @phy-addr:
> > +# Physical fault address associated with the error.
> > +# Argument is optional. If not specified and @validation not enforced,
> > +# this field will be marked as invalid at CPER record..
> > +#
> > +# Since: 9.1
> > +##
> > +{ 'struct': 'ArmProcessorErrorInformation',
> > + 'data': { '*validation': ['ArmPeiValidationBits'],
> > + 'type': ['ArmProcessorErrorType'],
> > + '*multiple-error': 'uint16',
> > + '*flags': ['ArmProcessorFlags'],
> > + '*error-info': 'uint64',
> > + '*virt-addr': 'uint64',
> > + '*phy-addr': 'uint64'}
> > +}
> > +
> > +##
> > +# @ArmProcessorContext:
> > +#
> > +# Provide processor context state specific to the ARM processor architecture,
> > +# According with UEFI 2.10 CPER table N.21.
> > +# Argument is optional.If not specified, no context will be used.
> > +#
> > +# @type:
> > +# Contains an integer value indicating the type of context state being
> > +# reported.
> > +# Argument is optional. If not defined, it will be set to be EL1 register
> > +# for the emulation, e. g.:
> > +# - on arm32: AArch32 EL1 context registers;
> > +# - on arm64: AArch64 EL1 context registers.
> > +#
> > +# @register:
> > +# Provides the contents of the actual registers or raw data, depending
> > +# on the context type.
> > +# Argument is optional. If not defined, it will fill the first register
> > +# with 0xDEADBEEF, and the other ones with zero.
> > +#
> > +# @minimal-size:
> > +# Argument is optional. If provided, define the minimal size of the
> > +# context register array. The actual size is defined by checking the
> > +# number of register values plus the content of this field (if used),
> > +# ensuring that each processor context information structure array is
> > +# padded with zeros if the size is not a multiple of 16 bytes.
> > +#
> > +# Since: 9.1
> > +##
> > +{ 'struct': 'ArmProcessorContext',
> > + 'data': { '*type': 'uint16',
> > + '*minimal-size': 'uint32',
> > + '*register': ['uint64']}
> > }
> >
> > ##
> > # @arm-inject-error:
> > #
> > -# Inject ARM Processor error.
> > +# Inject ARM Processor error with data to be filled accordign with UEFI 2.10
> > +# CPER table N.16.
> > #
> > -# @errortypes: ARM processor error types to inject
> > +# @validation:
> > +# Valid validation bits for ARM processor CPER.
> > +# Argument is optional. If not specified, the default is
> > +# calculated based on having the corresponding arguments filled.
> > +#
> > +# @affinity-level:
> > +# Error affinity level for errors that can be attributed to a specific
> > +# affinity level.
> > +# Argument is optional. If not specified and @validation not enforced,
> > +# this field will be marked as invalid at CPER record.
> > +#
> > +# @mpidr-el1:
> > +# Processor’s unique ID in the system.
> > +# Argument is optional. If not specified, it will use the cpu mpidr
> > +# field from the emulation data. If zero and @validation is not
> > +# enforced, this field will be marked as invalid at CPER record.
> > +#
> > +# @midr-el1: Identification info of the chip
> > +# Argument is optional. If not specified, it will use the cpu mpidr
> > +# field from the emulation data. If zero and @validation is not
> > +# enforced, this field will be marked as invalid at CPER record.
> > +#
> > +# @running-state:
> > +# Indicates the running state of the processor.
> > +# Argument is optional. If not specified and @validation not enforced,
> > +# this field will be marked as invalid at CPER record.
> > +#
> > +# @psci-state:
> > +# Provides PSCI state of the processor, as defined in ARM PSCI document.
> > +# Argument is optional. If not specified, it will use the cpu power
> > +# state field from the emulation data.
> > +#
> > +# @context:
> > +# Contains an array of processor context registers.
> > +# Argument is optional. If not specified, no context will be added.
> > +#
> > +# @vendor-specific:
> > +# Contains a byte array of vendor-specific data.
> > +# Argument is optional. If not specified, no vendor-specific data
> > +# will be added.
> > +#
> > +# @error:
> > +# Contains an array of ARM processor error information (PEI) sections.
> > +# Argument is optional. If not specified, defaults to a single
> > +# Program Error Information record defaulting to type=cache-error.
> > #
> > # Features:
> > #
> > @@ -44,6 +262,16 @@
> > # Since: 9.1
> > ##
> > { 'command': 'arm-inject-error',
> > - 'data': { 'errortypes': ['ArmProcessorErrorType'] },
> > + 'data': {
> > + '*validation': ['ArmProcessorValidationBits'],
> > + '*affinity-level': 'uint8',
> > + '*mpidr-el1': 'uint64',
> > + '*midr-el1': 'uint64',
> > + '*running-state': ['ArmProcessorRunningState'],
> > + '*psci-state': 'uint32',
> > + '*context': ['ArmProcessorContext'],
> > + '*vendor-specific': ['uint8'],
> > + '*error': ['ArmProcessorErrorInformation']
> > + },
> > 'features': [ 'unstable' ]
> > }
>
> This changes the command pretty much completely. Why is the previous
> state worth capturing in git?

I was thinking on having the first patch with minimal stuff and
letting patch 7 with everything, but after yours and Jonathan's
comments, I opted to merge them altogether.

>
> > diff --git a/tests/lcitool/libvirt-ci b/tests/lcitool/libvirt-ci
> > index 0e9490cebc72..77c800186f34 160000
> > --- a/tests/lcitool/libvirt-ci
> > +++ b/tests/lcitool/libvirt-ci
> > @@ -1 +1 @@
> > -Subproject commit 0e9490cebc726ef772b6c9e27dac32e7ae99f9b2
> > +Subproject commit 77c800186f34b21be7660750577cc5582a914deb
>
> Accident?
>

Yes. Working with submodules is sometimes tricky, as git commit -a wants
to merge everything including submodule changes, and manually dropping
submodule from existing commits is tricky. I added this to my environment,
but this affects only git diff porcelain:

[diff]
ignoreSubmodules = all

I wonder is are there ways for git commit -a to also ignore submodules...
perhaps some git hook?

Thanks,
Mauro