Re: [PATCH v15 0/2] ACPI / APEI: Add support to notify the vendor specific HW errors

From: Rafael J. Wysocki
Date: Tue Sep 15 2020 - 14:45:34 EST


On Mon, Sep 14, 2020 at 2:34 PM Shiju Jose <shiju.jose@xxxxxxxxxx> wrote:
>
> Hello,
>
> Can you help to merge this series?

Do you want this series to go in through the ACPI tree?

> >-----Original Message-----
> >From: Linuxarm [mailto:linuxarm-bounces@xxxxxxxxxx] On Behalf Of Shiju
> >Jose
> >Sent: 03 September 2020 13:35
> >To: linux-acpi@xxxxxxxxxxxxxxx; linux-pci@xxxxxxxxxxxxxxx; linux-
> >kernel@xxxxxxxxxxxxxxx; rjw@xxxxxxxxxxxxx; helgaas@xxxxxxxxxx;
> >bp@xxxxxxxxx; james.morse@xxxxxxx; lorenzo.pieralisi@xxxxxxx;
> >robh@xxxxxxxxxx; lenb@xxxxxxxxxx; tony.luck@xxxxxxxxx;
> >dan.carpenter@xxxxxxxxxx; andriy.shevchenko@xxxxxxxxxxxxxxx
> >Cc: Linuxarm <linuxarm@xxxxxxxxxx>
> >Subject: [PATCH v15 0/2] ACPI / APEI: Add support to notify the vendor
> >specific HW errors
> >
> >CPER records describing a firmware-first error are identified by GUID.
> >The ghes driver currently logs, but ignores any unknown CPER records.
> >This prevents describing errors that can't be represented by a standard entry
> >but that a driver could otherwise recover from.
> >The UEFI spec calls these 'Non-standard Section Body' (N.2.3 of version 2.8).
> >
> >This patch set:
> >1. adds a notifier chain for these non-standard/vendor records
> > in the ghes driver.
> >
> >2. adds a driver to handle HiSilicon HIP PCIe controller errors.
> >
> >Changes:
> >
> >V15:
> >1. Change in the HIP PCIe error handling driver
> > for a comment by Andy Shevchenko.
> > Removed "depends on ACPI" as it already depends on
> > it through ACPI_APEI_GHES.
> >
> >V14:
> >1. Add patch[1] posted by James to the series.
> >
> >2. Following changes made for Bjorn's comments,
> >2.1 Deleted stub code from ghes.h
> >2.2 Made CONFIG_PCIE_HISI_ERR depend on CONFIG_ACPI_APEI_GHES.
> >
> >V13:
> >1. Following changes in the HIP PCIe error handling driver.
> >1.1 Add Bjorn's acked-by.
> >1.2. Address the comments and macros order Bjorn mentioned.
> > Fix the words in the commit.
> >
> >V12:
> >1. Changed the Signed-off-by tag to Co-developed-by tag in the patch
> > "ACPI / APEI: Add a notifier chain for unknown (vendor) CPER records"
> >
> >V11:
> >1. Following modifications made by James Morse in the APEI patch
> > for the vendor error record.
> > - Removed kfifo and ghes_gdata_pool. Expanded commit message.
> >
> >2. Changes in the HIP PCIe error handling driver
> > for the comments by Andy Shevchenko.
> >
> >V10:
> >1. Changes for Bjorn's comments on HIP PCIe error handler driver
> > and APEI patch.
> >
> >2. Changes in the HIP PCIe error handler driver
> > for the feedback from Andy Shevchenko.
> >
> >V9:
> >1. Made 2 improvements suggested by the kbuild test robot.
> >1.1 Made ghes_gdata_pool_init() a static function.
> >1.2. Stopped using a buffer to store the error data for
> > logging in hisi_pcie_handle_error().
> >
> >V8:
> >1. Removed reporting the standard errors through the interface
> > because of the conflict with the recent patches in the
> > memory error handling path.
> >2. Fix comments by Dan Carpenter.
> >
> >V7:
> >1. Add changes in the APEI driver suggested by Borislav Petkov, for
> > queuing up all the non-fatal HW errors to the work queue and
> > notify the registered kernel drivers from the bottom half using
> > blocking notifier, common interface for both standard and
> > vendor-specific errors.
> >2. Fixes for further feedback on the v5 HIP PCIe error handler driver
> > by Bjorn Helgaas.
> >
> >V6:
> >1. Made a few changes to the patch subject line as suggested by Bjorn Helgaas.
> >
> >V5:
> >1. Fix comments from James Morse.
> >1.1 Changed the notification method to use the atomic_notifier_chain.
> >1.2 Add the error handled status for the user space.
> >
> >V4:
> >1. Fix for the following smatch warning in the PCIe error driver,
> > reported by kbuild test robot <lkp@xxxxxxxxx>:
> > warn: should '((((1))) << (9 + i))' be a 64 bit type?
> > if (err->val_bits & BIT(HISI_PCIE_LOCAL_VALID_ERR_MISC + i))
> > ^^^ This should be BIT_ULL() because it goes up to 9 + 32.
> >
> >V3:
> >1. Fix the comments from Bjorn Helgaas.
> >
> >V2:
> >1. Changes in the HiSilicon PCIe controller's error handling driver
> > for the comments from Bjorn Helgaas.
> >
> >2. Changes in the APEI interface to support reporting vendor errors
> > for a module with multiple devices that share the same section type.
> > The error handler uses the socket id, sub module id, etc. to distinguish
> > the device.
> >
> >V1:
> >1. Fix comments from James Morse.
> >
> >2. add driver to handle HiSilicon hip08 PCIe controller's errors,
> > which is an application of the above interface.
> >
> >Shiju Jose (1):
> > ACPI / APEI: Add a notifier chain for unknown (vendor) CPER records
> >
> >Yicong Yang (1):
> > PCI: hip: Add handling of HiSilicon HIP PCIe controller errors
> >
> > drivers/acpi/apei/ghes.c | 63 +++++
> > drivers/pci/controller/Kconfig | 7 +
> > drivers/pci/controller/Makefile | 1 +
> > drivers/pci/controller/pcie-hisi-error.c | 327 +++++++++++++++++++++++
> > include/acpi/ghes.h | 18 ++
> > 5 files changed, 416 insertions(+)
> > create mode 100644 drivers/pci/controller/pcie-hisi-error.c
> >
> >--
> >2.17.1