Re: [PATCH] PCI: layerscape: Change back to the default error response behavior

From: Bjorn Helgaas
Date: Tue Sep 29 2020 - 11:02:57 EST


On Tue, Sep 29, 2020 at 09:13:28PM +0800, Zhiqiang Hou wrote:
> From: Hou Zhiqiang <Zhiqiang.Hou@xxxxxxx>
>
> In the current error response behavior, it will send a SLVERR response
> to device's internal AXI slave system interface when the PCIe controller
> experiences an erroneous completion (UR, CA and CT) from an external
> completer for its outbound non-posted request, which will result in
> SError and crash the kernel directly.

Possible wording:

As currently configured, when the PCIe controller receives a
Completion with UR or CA status, or a Completion Timeout occurs, it
sends a SLVERR response to the internal AXI slave system interface,
which results in SError and a kernel crash.

Please add a blank line between paragraphs, and
s/This patch change back it/Change it/ below.

> This patch change back it to the default behavior to increase the
> robustness of the kernel. In the default behavior, it always sends an
> OKAY response to the internal AXI slave interface when the controller
> gets these erroneous completions. And the AER driver will report and
> try to recover these errors.
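
To make the two settings concrete, here is a minimal model of what the
bridge does with the completion for an outbound non-posted request.
It is only a sketch of the behavior described above, not driver code.
The enum and function names are hypothetical, and "forward_errors"
stands for programming PCIE_ABSERR with PCIE_ABSERR_SETTING, as the
driver currently does.

```c
#include <assert.h>
#include <stdbool.h>

/* Outcome of an outbound non-posted request (names are hypothetical). */
enum cpl_outcome {
	CPL_SUCCESS,	/* Successful Completion */
	CPL_UR,		/* Completion with Unsupported Request status */
	CPL_CA,		/* Completion with Completer Abort status */
	CPL_TIMEOUT,	/* Completion Timeout */
};

/* Response the bridge returns on the internal AXI slave interface. */
enum axi_resp {
	AXI_OKAY,
	AXI_SLVERR,
};

/*
 * Model of the bridge policy.
 * forward_errors == true:  PCIE_ABSERR programmed to forward errors.
 * forward_errors == false: hardware default, always answer OKAY.
 */
static enum axi_resp bridge_response(enum cpl_outcome outcome,
				     bool forward_errors)
{
	switch (outcome) {
	case CPL_SUCCESS:
		return AXI_OKAY;
	case CPL_UR:
	case CPL_CA:
	case CPL_TIMEOUT:
		return forward_errors ? AXI_SLVERR : AXI_OKAY;
	}

	assert(!"unknown completion outcome");
	return AXI_SLVERR;
}

/* Per the commit log, a SLVERR response ends up as an SError on the CPU. */
static bool cpu_takes_serror(enum axi_resp resp)
{
	return resp == AXI_SLVERR;
}
```

In this model the only difference between the two settings is the
failing cases: with forwarding the CPU takes an SError, without it the
access looks like any successful one.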

This reverts 84d897d69938 ("PCI: layerscape: Change default error
response behavior"), so please mention that in the commit log,
probably as:

Fixes: 84d897d69938 ("PCI: layerscape: Change default error response behavior")

Maybe it also needs a stable tag, e.g., v4.15+?

Since this is a pure revert, whatever problem 84d897d69938 fixed must
now be fixed in some other way. Otherwise, this revert would just be
reintroducing the problem fixed by 84d897d69938.

This commit log should mention what that other fix is.

AER is only a reporting mechanism. It is asynchronous to the
instruction stream, and it's optional (it may not be implemented in
the hardware, and may not be supported by the kernel), so I'm not
convinced that it can be the answer to this problem.
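
To illustrate the concern: if the access completes with OKAY, the
code that issued the read has to decide synchronously whether the data
is usable, before any AER report arrives. The sketch below assumes the
controller hands back all-ones data together with the OKAY response;
the patch does not say what data is returned, so that value is an
assumption. The fake_dev structure and both helpers are hypothetical
stand-ins, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: a failed read under the OKAY policy yields all-ones data. */
#define ERROR_RESPONSE_DATA 0xffffffffu

/* Hypothetical stand-in for a device register behind the controller. */
struct fake_dev {
	bool present;	/* false: the completer answers UR or times out */
	uint32_t reg;	/* register contents when the device is present */
};

/* Simulated MMIO read: never faults, returns all-ones data on error. */
static uint32_t fake_readl(const struct fake_dev *dev)
{
	assert(dev);
	return dev->present ? dev->reg : ERROR_RESPONSE_DATA;
}

/*
 * Check the caller needs right after the read. Returns false when the
 * data looks like an error response. This is ambiguous: a register that
 * legitimately holds 0xffffffff is rejected as well.
 */
static bool read_reg_checked(const struct fake_dev *dev, uint32_t *val)
{
	*val = fake_readl(dev);
	return *val != ERROR_RESPONSE_DATA;
}
```

Every caller that cares about the result would need a check like
read_reg_checked(), and the ambiguity noted in its comment remains.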

> Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@xxxxxxx>
> ---
> drivers/pci/controller/dwc/pci-layerscape.c | 11 -----------
> 1 file changed, 11 deletions(-)
>
> diff --git a/drivers/pci/controller/dwc/pci-layerscape.c b/drivers/pci/controller/dwc/pci-layerscape.c
> index f24f79a70d9a..e92ab8a77046 100644
> --- a/drivers/pci/controller/dwc/pci-layerscape.c
> +++ b/drivers/pci/controller/dwc/pci-layerscape.c
> @@ -30,8 +30,6 @@
>
> /* PEX Internal Configuration Registers */
> #define PCIE_STRFMR1 0x71c /* Symbol Timer & Filter Mask Register1 */
> -#define PCIE_ABSERR 0x8d0 /* Bridge Slave Error Response Register */
> -#define PCIE_ABSERR_SETTING 0x9401 /* Forward error of non-posted request */
>
> #define PCIE_IATU_NUM 6
>
> @@ -123,14 +121,6 @@ static int ls_pcie_link_up(struct dw_pcie *pci)
> 	return 1;
> }
>
> -/* Forward error response of outbound non-posted requests */
> -static void ls_pcie_fix_error_response(struct ls_pcie *pcie)
> -{
> -	struct dw_pcie *pci = pcie->pci;
> -
> -	iowrite32(PCIE_ABSERR_SETTING, pci->dbi_base + PCIE_ABSERR);
> -}
> -
> static int ls_pcie_host_init(struct pcie_port *pp)
> {
> 	struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
> @@ -142,7 +132,6 @@ static int ls_pcie_host_init(struct pcie_port *pp)
> 	 * dw_pcie_setup_rc() will reconfigure the outbound windows.
> 	 */
> 	ls_pcie_disable_outbound_atus(pcie);
> -	ls_pcie_fix_error_response(pcie);
>
> 	dw_pcie_dbi_ro_wr_en(pci);
> 	ls_pcie_clear_multifunction(pcie);
> --
> 2.17.1
>