Re: [PATCH v2 1/3] PCI/AER: Fix incorrect return fromaer_hest_parse()

From: Betty Dall
Date: Mon Jun 03 2013 - 17:18:46 EST


On Sat, 2013-06-01 at 18:38 -0600, Bjorn Helgaas wrote:
> [+cc Bob for ACPI HEST spec questions]
>
> On Thu, May 30, 2013 at 8:39 AM, Betty Dall <betty.dall@xxxxxx> wrote:
> > The function aer_hest_parse() is called to determine if the given
> > PCI device is firmware first or not. The code loops through each
> > section of the HEST table to look for a match. The bug is that
> > the function always returns whether the last HEST section is firmware
> > first. The fix stops the iteration once the info.firmware_first
> > variable is set. This is similar to how the function aer_hest_parse_aff()
> > stops the iteration.
> >
> > Signed-off-by: Betty Dall <betty.dall@xxxxxx>
> > ---
> >
> > drivers/pci/pcie/aer/aerdrv_acpi.c | 3 +++
> > 1 files changed, 3 insertions(+), 0 deletions(-)
> >
> >
> > diff --git a/drivers/pci/pcie/aer/aerdrv_acpi.c b/drivers/pci/pcie/aer/aerdrv_acpi.c
> > index 5194a7d..39b8671 100644
> > --- a/drivers/pci/pcie/aer/aerdrv_acpi.c
> > +++ b/drivers/pci/pcie/aer/aerdrv_acpi.c
> > @@ -42,6 +42,9 @@ static int aer_hest_parse(struct acpi_hest_header *hest_hdr, void *data)
> > u8 bridge = 0;
> > int ff = 0;
> >
> > + if (info->firmware_first)
> > + return 0;
> > +
> > switch (hest_hdr->type) {
> > case ACPI_HEST_TYPE_AER_ROOT_PORT:
> > pcie_type = PCI_EXP_TYPE_ROOT_PORT;
>
> Not related directly to your patch, Betty, but I can't figure out why
> the ACPI spec defines the HEST structures for PCIe as it does. I'm
> looking at ACPI 5.0, sec 18.3.2.3 - 18.3.2.5.

> 1) The PCIe Root Port, PCIe Device, and PCIe/PCI-X Bridge structures
> all include Bus, Device, and Function fields. But there's no Segment.
> The current Linux code (hest_match_pci()) assumes HEST records can
> only apply to PCI domain 0. Is Linux missing something, or is the
> HEST really this limited?
You are right that the HEST table does not have the Segment for the PCIe
sources. The Linux code uses the Generic Source type that points to a
UEFI CPER record. Those do have the Segment. The code in
acpi/apei/ghes.c that parses the HEST and invokes the
aer_recover_queue() is using the segment from the CPER record.

The hest_match_pci() code is only looking at the PCIe error source types
because that is where the flags field is defined to indicate "firmware
first". Flags is not defined in the Generic error source. Firmware
first platforms today must be using GLOBAL error sources because the
code is written in such a way that any specific PCIe match will cause
the entire platform to be firmware first. Linux only supports either AER
or firmware first right now, and it would be an enhancement to do both.

> 2) Doesn't the fact that the HEST structures include a bus number mean
> that the OS cannot reassign bus numbers? I guess maybe we could still
> reassign them if we kept track of the mapping back to the original
> values.
Yes, I think there needs to be a mapping between the HEST/CPER bus
number and the OS assigned bus numbers, if the OS is reassigning bus
numbers.

> 3) It's interesting that the only choices seem to be "global" or "list
> every device." There's no "apply to this device and the subtree under
> it," so I guess if you want HEST to apply to hot-added devices,
> "global" is the only thing that makes sense?
"Global" is the one one that makes sense to me too. The size of the HEST
table is fixed at boot and firmware strives to keep it as small as
possible because firmware memory is a scarce resource.

Betty

> Bjorn


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/