Re: [PATCH V2] PCI: Clear errors logged in Secondary Status Register

From: Vidya Sagar
Date: Mon Apr 01 2024 - 04:00:13 EST


Hi Bjorn,
Just checking on this thread.
Is there anything else you want me to clarify on?

Thanks,
Vidya Sagar

On 14-03-2024 06:09, Vidya Sagar wrote:


On 23-01-2024 04:30, Bjorn Helgaas wrote:
External email: Use caution opening links or attachments


On Tue, Jan 16, 2024 at 08:02:58PM +0530, Vidya Sagar wrote:
The enumeration process leaves the 'Received Master Abort' bit set in
the Secondary Status Register of the downstream port in the following
scenarios.

(1) The device connected to the downstream port has ARI capability
     and that makes the kernel set the 'ARI Forwarding Enable' bit in
     the Device Control 2 Register of the downstream port. This
     effectively makes the downstream port forward the configuration
     requests targeting the devices downstream of it, even though they
     don't exist in reality. It causes the downstream devices return
     completions with UR set in the status in turn causing 'Received
     Master Abort' bit set.

     In contrast, if the downstream device doesn't have ARI capability,
     the 'ARI Forwarding Enable' bit in the downstream port is not set
     and any configuration requests targeting the downstream devices
     that don't exist are terminated (section 6.13 of PCI Express Base
     6.0 spec) in the downstream port itself resulting in no change of
     the 'Received Master Abort' bit.

(2) A PCIe switch is connected to the downstream port and when the
     enumeration flow tries to explore the presence of devices that
     don't really exist downstream of the switch, the downstream
     port receives the completions with UR set causing the 'Received
     Master Abort' bit set.
Are these the only possible ways this error is logged?  I expected
them to be logged when we enumerate below a Root Port that has nothing
attached, for example.
In this case, there won't be any TLP sent downstream. I talked about this scenario in the
second paragraph of point (1) above.
Does clearing them in pci_scan_bridge_extend() cover all ways this
error might be logged during enumeration?  I can't remember whether
all enumeration goes through this path.
So far in my testing, clearing it in pci_scan_bridge_extend() covers all the cases.

Clear 'Received Master Abort' bit to keep the bridge device in a clean
state post enumeration.

Signed-off-by: Vidya Sagar <vidyas@xxxxxxxxxx>
---
V2:
* Changed commit message based on Bjorn's feedback

  drivers/pci/probe.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 795534589b98..640d2871b061 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1470,6 +1470,9 @@ static int pci_scan_bridge_extend(struct pci_bus *bus, struct pci_dev *dev,
       }

  out:
+     /* Clear errors in the Secondary Status Register */
+     pci_write_config_word(dev, PCI_SEC_STATUS, 0xffff);
+
       pci_write_config_word(dev, PCI_BRIDGE_CONTROL, bctl);

       pm_runtime_put(&dev->dev);
--
2.25.1