Re: [PATCH v5 2/2] PCI: HIP: Add handling of HiSilicon HIP PCIe controller errors

From: Bjorn Helgaas
Date: Wed Mar 25 2020 - 13:36:44 EST


[+cc Dan]

On Wed, Mar 25, 2020 at 01:55:18PM +0000, Shiju Jose wrote:
> The HiSilicon HIP PCIe controller is capable of handling errors
> on root port and perform port reset separately at each root port.
>
> This patch add error handling driver for HIP PCIe controller to log
> and report recoverable errors. Perform root port reset and restore
> link status after the recovery.
>
> Following are some of the PCIe controller's recoverable errors
> 1. completion transmission timeout error.
> 2. CRS retry counter over the threshold error.
> 3. ECC 2 bit errors
> 4. AXI bresponse/rresponse errors etc.
>
> Also fix the following Smatch warning:
> warn: should '((((1))) << (9 + i))' be a 64 bit type?
> if (err->val_bits & BIT(HISI_PCIE_LOCAL_VALID_ERR_MISC + i))
> ^^^ This should be BIT_ULL() because it goes up to 9 + 32.
> Reported-by: kbuild test robot <lkp@xxxxxxxxx>
> Reported-by: Dan Carpenter <dan.carpenter@xxxxxxxxxx>

I'm glad you did this fix, and thanks for acknowledging Dan, but I
don't think it's necessary to mention it in the commit log here
because it won't really be useful in the future. It's only relevant
when comparing the unmerged versions of this series, e.g., v4 compared
to v3.

If we were fixing something that's already been merged upstream, we
should absolutely include this, but since this hasn't been merged yet
Dan's report is basically the same as other review comments, which we
normally just address and mention in the change history in the [0/n]
cover letter (as you're already doing, thanks for that!).

Also, I think it's nice to CC: anybody who has commented on previous
versions of the patch series, so I added Dan to the CC: list here.
That way he can chime in if we're not addressing his report correctly.

> Signed-off-by: Yicong Yang <yangyicong@xxxxxxxxxxxxx>
> Signed-off-by: Shiju Jose <shiju.jose@xxxxxxxxxx>
> --
> drivers/pci/controller/Kconfig | 8 +
> drivers/pci/controller/Makefile | 1 +
> drivers/pci/controller/pcie-hisi-error.c | 336
> +++++++++++++++++++++++++++++++
> 3 files changed, 345 insertions(+)
> create mode 100644 drivers/pci/controller/pcie-hisi-error.c

As I mentioned in the other message, I think this file should be
drivers/pci/controller/dwc/pcie-hisi-error.c so it's right next to
pcie-hisi.c. If there's some reason it needs to be here instead,
please mention that in the commit log.

> ---
> drivers/pci/controller/Kconfig | 8 +
> drivers/pci/controller/Makefile | 1 +
> drivers/pci/controller/pcie-hisi-error.c | 357 +++++++++++++++++++++++
> 3 files changed, 366 insertions(+)
> create mode 100644 drivers/pci/controller/pcie-hisi-error.c

> +struct hisi_pcie_err_data {
> + u64 val_bits;
> + u8 version;
> + u8 soc_id;
> + u8 socket_id;
> + u8 nimbus_id;
> + u8 sub_module_id;
> + u8 core_id;
> + u8 port_id;
> + u8 err_severity;
> + u16 err_type;
> + u8 reserv[2];
> + u32 err_misc[HISI_PCIE_ERR_MISC_REGS];
> +};
> +
> +struct hisi_pcie_err_info {
> + struct hisi_pcie_err_data err_data;
> + struct platform_device *pdev;
> +};
> +
> +struct hisi_pcie_err_private {
> + struct notifier_block nb;
> + struct platform_device *pdev;
> +};

Either align all the struct members or none of them. Currently
hisi_pcie_err_data is aligned but hisi_pcie_err_info and
hisi_pcie_err_private are not.

> + /* Call the ACPI handle to reset root port */

Superfluous comment.

> + s = acpi_evaluate_integer(handle, "RST", &arg_list, &data);
> + if (ACPI_FAILURE(s)) {
> + dev_err(dev, "No RST method\n");
> + return -EIO;
> + }

> +static void hisi_pcie_handle_one_error(const struct hisi_pcie_err_data
> *err,
> + struct platform_device *pdev)

Align "struct platform_device ..." under "const struct
hisi_pcie_err_data ...".

Bjorn