Re: [PATCH v3 4/7] acpi/ghes: Add a logic to handle block addresses and FW first ARM processor error injection
From: Mauro Carvalho Chehab
Date: Mon Jul 29 2024 - 08:49:14 EST
On Fri, 26 Jul 2024 13:46:46 +0100
Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> wrote:
> A few quick replies from me.
> I'm sure Mauro will add more info.
>
> > > + 'tlb-error',
> > > + 'bus-error',
> > > + 'micro-arch-error']
> > > +}
> > > +
> > > +##
> > > +# @arm-inject-error:
> > > +#
> > > +# Inject ARM Processor error.
> > > +#
> > > +# @errortypes: ARM processor error types to inject
> > > +#
> > > +# Features:
> > > +#
> > > +# @unstable: This command is experimental.
> > > +#
> > > +# Since: 9.1
> > > +##
> > > +{ 'command': 'arm-inject-error',
> > > + 'data': { 'errortypes': ['ArmProcessorErrorType'] },
> >
> > Please separate words with dashes: 'error-types'.
> >
> > > + 'features': [ 'unstable' ]
> > > +}
> >
> > Is this used only with TARGET_ARM?
> >
> > Why is being able to inject multiple error types at once useful?
>
> It pokes a weird corner of the specification that I think previously
> tripped up Linux.
>
> >
> > I'd expect at least some of these errors to come with additional
> > information. For instance, I imagine a bus error is associated with
> > some address.
>
> Absolutely agree that in sane case you wouldn't have multiple errors
> but we want to hit the insane ones :(
Yes.
> There is only provision for one set of data in the record despite
> it providing a bitmap for the type of error.
Well, there isn't anything in the UEFI spec that forbids setting multiple
bits. On a "normal" bitmask field, having more than one bit set is supported.
So, as the spec doesn't forbid it, it should be valid to have more than one
bit set.
Now, when multiple error bits from this table are set:
+-----+---------------------------+
| Bit | Meaning                   |
+=====+===========================+
| 1   | Cache Error               |
| 2   | TLB Error                 |
| 3   | Bus Error                 |
| 4   | Micro-architectural Error |
+-----+---------------------------+
- if bit 4 is set, then, as stated in the spec, the error-info field is
  defined by the ARM vendor, according to:

    "N.2.4.4.1.1. ARM Vendor Specific Micro-Architecture ErrorStructure
     This is a vendor specific structure. Please refer to your hardware
     vendor documentation for the format of this structure."

  So, provided that the vendor-specific documentation explicitly allows
  setting bit 4 together with other bits, I don't see a UEFI compliance
  problem.

- if bit 4 is not set, but more than one of bits 1 to 3 is set, the content
  of error-info is currently undefined, as tables N.18 to N.20 won't
  apply.

Anyway, from the spec PoV, IMO UEFI requires an erratum to either clearly
enforce that just one bit may be set or define the behavior when multiple
ones are set.
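
To make the cases above concrete, here is a minimal sketch in C of how a
consumer might classify the type bitmap. It is not taken from the patch:
the macro names, the enum and classify_error_info() are hypothetical, and
the handling of "no bit set" is my own assumption, as that case isn't
discussed above.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions from the table above; the macro names are hypothetical. */
#define ARM_PEI_CACHE_ERROR      (1u << 1)
#define ARM_PEI_TLB_ERROR        (1u << 2)
#define ARM_PEI_BUS_ERROR        (1u << 3)
#define ARM_PEI_MICRO_ARCH_ERROR (1u << 4)

/* How the error-info field should be interpreted (hypothetical enum). */
typedef enum {
    ERROR_INFO_NONE,      /* no type bit set (assumption: nothing to decode) */
    ERROR_INFO_CACHE,     /* only bit 1: one of tables N.18 to N.20 applies */
    ERROR_INFO_TLB,       /* only bit 2: one of tables N.18 to N.20 applies */
    ERROR_INFO_BUS,       /* only bit 3: one of tables N.18 to N.20 applies */
    ERROR_INFO_VENDOR,    /* bit 4 set: format is defined by the ARM vendor */
    ERROR_INFO_UNDEFINED  /* several of bits 1-3 set and bit 4 clear */
} ErrorInfoLayout;

static ErrorInfoLayout classify_error_info(uint8_t type_bits)
{
    uint8_t arch;

    /*
     * Bit 4 makes error-info vendor specific. Whether other bits may be
     * set together with it depends on the vendor documentation.
     */
    if (type_bits & ARM_PEI_MICRO_ARCH_ERROR) {
        return ERROR_INFO_VENDOR;
    }

    arch = type_bits & (ARM_PEI_CACHE_ERROR | ARM_PEI_TLB_ERROR |
                        ARM_PEI_BUS_ERROR);
    if (arch == 0) {
        return ERROR_INFO_NONE;
    }

    /* More than one of bits 1-3 set: the spec doesn't define error-info. */
    if (arch & (arch - 1)) {
        return ERROR_INFO_UNDEFINED;
    }

    switch (arch) {
    case ARM_PEI_CACHE_ERROR:
        return ERROR_INFO_CACHE;
    case ARM_PEI_TLB_ERROR:
        return ERROR_INFO_TLB;
    case ARM_PEI_BUS_ERROR:
        return ERROR_INFO_BUS;
    default:
        /* Not reachable: arch has exactly one of the three bits set. */
        assert(0 && "unreachable");
        return ERROR_INFO_UNDEFINED;
    }
}
```

With this, cache + bus (bits 1 and 3) ends up as ERROR_INFO_UNDEFINED, which
is exactly the corner case the multi-type injection is meant to exercise.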
Thanks,
Mauro