Re: [PATCH v3 0/6] PCI/ACPI: Fix firmware first error recovery with root port in reset

From: Bjorn Helgaas
Date: Thu Jun 06 2013 - 17:05:23 EST


On Thu, Jun 6, 2013 at 12:10 PM, Betty Dall <betty.dall@xxxxxx> wrote:
> This patch set fixes a bug on platforms that use firmware first AER.
> Firmware can leave the root port in Secondary Bus Reset (SBR) and
> communicate this to the OS through the "reset" bit in the flags field
> of the HEST table and associated CPER records. Firmware wants to do this
> so that the error is contained and the hardware is in a known state.
>
> Without these patches, the root port stays in SBR and the device drivers
> cannot recover. These patches recognize when the firmware first root port
> is in SBR and bring the root port out of SBR so the devices under the root
> port can recover.
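
For readers following along, here is a minimal user-space sketch of what
"bringing the root port out of SBR" amounts to at the register level. It is
not the code from these patches. The register offset and bit are the standard
Bridge Control / Secondary Bus Reset values (named as in the kernel's
pci_regs.h); struct fake_port and the cfg_read16/cfg_write16/port_* helpers
are hypothetical stand-ins for real config-space accessors.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard PCI bridge config-space values (names as in pci_regs.h). */
#define PCI_BRIDGE_CONTROL        0x3E  /* Bridge Control register offset */
#define PCI_BRIDGE_CTL_BUS_RESET  0x40  /* Secondary Bus Reset bit */

/* Hypothetical stand-in for a root port's config space. */
struct fake_port {
	uint8_t cfg[256];
};

/* Little-endian 16-bit read from the simulated config space. */
uint16_t cfg_read16(const struct fake_port *p, unsigned int off)
{
	return (uint16_t)(p->cfg[off] | (p->cfg[off + 1] << 8));
}

/* Little-endian 16-bit write to the simulated config space. */
void cfg_write16(struct fake_port *p, unsigned int off, uint16_t val)
{
	p->cfg[off] = (uint8_t)(val & 0xff);
	p->cfg[off + 1] = (uint8_t)(val >> 8);
}

/* True if firmware left the port holding its secondary bus in reset. */
bool port_in_sbr(const struct fake_port *p)
{
	return (cfg_read16(p, PCI_BRIDGE_CONTROL) & PCI_BRIDGE_CTL_BUS_RESET) != 0;
}

/* Clear only the SBR bit, preserving the other Bridge Control bits. */
void port_release_sbr(struct fake_port *p)
{
	uint16_t ctl = cfg_read16(p, PCI_BRIDGE_CONTROL);

	ctl &= (uint16_t)~PCI_BRIDGE_CTL_BUS_RESET;
	cfg_write16(p, PCI_BRIDGE_CONTROL, ctl);
}
```

A real driver would also have to wait after deasserting the bit before
touching devices below the port; the sketch leaves the delays out.
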
>
> The changes have been tested on systems with firmware first that set the
> "reset" bit by injecting various hardware errors. The errors successfully
> recover.
>
> Changes since v1:
> Fixed a typo in the comment of patch 2.
> Removed incorrect setting of reset bit in patch 3.
>
> Changes since v2:
> The v2 patch 1/3 was re-written by Bjorn Helgaas and is now patches 1/6
> through 3/6.
> The v2 patch 2/3 is now 5/6 and changed to directly use the AER_FATAL define
> and introduced patch 4/6 to move the defines to a public header file.
> The v2 patch 3/3 is now 6/6 and uses the same default reset link function for
> both Downstream Ports and Root Ports.
>
> Signed-off-by: Betty Dall <betty.dall@xxxxxx>
> ---
> Betty Dall (6):
>   PCI/AER: Don't parse HEST table for non-PCIe devices
>   PCI/AER: Factor out HEST device type matching
>   PCI/AER: Set dev->__aer_firmware_first only for matching devices
>   PCI/ACPI: Move AER severity defines to aer.h
>   ACPI/APEI: Force fatal AER severity when bus has been reset
>   PCI/AER: Provide reset_link for firmware first root port
> ---
>  drivers/acpi/apei/ghes.c           | 10 +++++++
>  drivers/pci/pcie/aer/aerdrv.h      |  4 ---
>  drivers/pci/pcie/aer/aerdrv_acpi.c | 47 ++++++++++++++++++-----------------
>  drivers/pci/pcie/aer/aerdrv_core.c | 17 +++++++------
>  include/linux/aer.h                | 16 +++++++----
>  5 files changed, 53 insertions(+), 41 deletions(-)
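
As an illustration of the idea behind "Force fatal AER severity when bus has
been reset", the sketch below shows the kind of check involved. It is not the
patch itself. AER_FATAL is the define named in the cover letter; the numeric
severity values and the CPER_SEC_RESET flag value are my assumptions about the
kernel and UEFI definitions, and effective_aer_severity() is a hypothetical
helper name.

```c
#include <stdint.h>

/* Assumed values of the AER severity defines that 4/6 moves to aer.h. */
#define AER_NONFATAL     0
#define AER_FATAL        1
#define AER_CORRECTABLE  2

/* Assumed value of the CPER section "reset" flag. */
#define CPER_SEC_RESET   0x0004

/*
 * Hypothetical helper: if firmware reports that it reset the bus, treat
 * the error as fatal regardless of the severity it reported, so that the
 * recovery path that resets the link is taken.
 */
int effective_aer_severity(int reported, uint32_t section_flags)
{
	if (section_flags & CPER_SEC_RESET)
		return AER_FATAL;
	return reported;
}
```

With the reset flag clear, the reported severity passes through unchanged.
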

I put these on http://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/log/?h=pci/betty-aer-v3

I'll merge them into -next for v3.11 soon. Thanks, Betty!