Re: [PATCH 1/2] PCI/portdrv: Add necessary delay for disabling hotplug events

From: Sathyanarayanan Kuppuswamy
Date: Thu Feb 06 2025 - 23:27:09 EST



On 2/5/25 7:18 PM, Feng Tang wrote:
Hi Sathyanarayanan,

On Wed, Feb 05, 2025 at 10:26:59AM -0800, Sathyanarayanan Kuppuswamy wrote:
On 2/3/25 9:37 PM, Feng Tang wrote:
According to the PCIe 6.1 spec, section 6.7.3.2, software needs to wait at
least 1 second for the command-complete event before resending the cmd
or sending a new cmd.

Currently get_port_device_capability() sends a slot control cmd to disable
PCIe hotplug interrupts without waiting for its completion, and a real
problem was reported due to the lack of waiting.
Can you include the error log associated with this issue? What is the
actual issue you are seeing and in which hardware?
For this one, we don't have a specific log, as it was raised by a firmware
developer, see https://lore.kernel.org/lkml/Z6LRAozZm1UfgjqT@U-2FWC9VHC-2323.local/

When handling a PCI hotplug problem, they hit an issue, found their state
machine corrupted, and traced it back to the OS. They didn't expect to receive
2 link control commands at almost the same time, which doesn't comply with

Which 2 commands from OS? Did you identify both commands?

the PCIe spec, and normally the handling of one command will take some time
in the BIOS, though not as long as 1 second. The HW is an ARM server.

I will try to add this info to the commit log in the next version.

Ok. Please include it.


Add the necessary wait to comply with the PCIe spec. The waiting logic follows
the existing pcie_poll_cmd().

Signed-off-by: Feng Tang <feng.tang@xxxxxxxxxxxxxxxxx>
---
drivers/pci/pci.h | 2 ++
drivers/pci/pcie/portdrv.c | 33 +++++++++++++++++++++++++++++++--
2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 01e51db8d285..c1e234d1b81d 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -759,12 +759,14 @@ static inline void pcie_ecrc_get_policy(char *str) { }
#ifdef CONFIG_PCIEPORTBUS
void pcie_reset_lbms_count(struct pci_dev *port);
int pcie_lbms_count(struct pci_dev *port, unsigned long *val);
+void pcie_disable_hp_interrupts_early(struct pci_dev *dev);
#else
static inline void pcie_reset_lbms_count(struct pci_dev *port) {}
static inline int pcie_lbms_count(struct pci_dev *port, unsigned long *val)
{
return -EOPNOTSUPP;
}
+static inline void pcie_disable_hp_interrupts_early(struct pci_dev *dev) {}
#endif
struct pci_dev_reset_methods {
diff --git a/drivers/pci/pcie/portdrv.c b/drivers/pci/pcie/portdrv.c
index 02e73099bad0..16010973bfe2 100644
--- a/drivers/pci/pcie/portdrv.c
+++ b/drivers/pci/pcie/portdrv.c
@@ -18,6 +18,7 @@
#include <linux/string.h>
#include <linux/slab.h>
#include <linux/aer.h>
+#include <linux/delay.h>
#include "../pci.h"
#include "portdrv.h"
@@ -205,6 +206,35 @@ static int pcie_init_service_irqs(struct pci_dev *dev, int *irqs, int mask)
return 0;
}
+static int pcie_wait_sltctl_cmd_raw(struct pci_dev *pdev)
+{
+ u16 slot_status;
+ /* 1000 ms, according to PCIe spec 6.1, section 6.7.3.2 */
+ int timeout = 1000;
+
+ do {
+ pcie_capability_read_word(pdev, PCI_EXP_SLTSTA, &slot_status);
+ if (slot_status & PCI_EXP_SLTSTA_CC) {
+ pcie_capability_write_word(pdev, PCI_EXP_SLTSTA,
+ PCI_EXP_SLTSTA_CC);
+ return 0;
+ }
+ msleep(10);
+ timeout -= 10;
+ } while (timeout);
+
+ /* Timeout */
+ return -1;
+}
Maybe this logic can be simplified using readl_poll_timeout()?
Seems this is exactly what I needed :) Many thanks for the suggestion!

- Feng
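
For illustration only, here is a rough userspace sketch of the poll-with-timeout
shape discussed above. It is not the kernel code and not the proposed patch.
Everything named mock_* and wait_sltctl_cmd() is hypothetical; SLTSTA_CC just
mirrors the value of PCI_EXP_SLTSTA_CC (0x0010). One caveat worth keeping in
mind: readl_poll_timeout() polls an MMIO address, while the patch reads the
Slot Status register through pcie_capability_read_word(), so a real conversion
would probably need the generic read_poll_timeout() from <linux/iopoll.h> with
a small accessor that returns the register value. The sketch below only models
the semantics (poll, check the condition once more after the deadline, clear
the write-1-to-clear bit on success, return -ETIMEDOUT on failure).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Command Completed bit in Slot Status; same value as PCI_EXP_SLTSTA_CC. */
#define SLTSTA_CC 0x0010

/* Hypothetical stand-in for a hotplug-capable port (not a kernel struct). */
struct mock_port {
	uint16_t slot_status;
	int reads_until_cc;	/* reads before CC gets set; negative means never */
};

/* Mock register read: sets CC once the configured number of reads has passed. */
static uint16_t mock_read_sltsta(struct mock_port *p)
{
	if (p->reads_until_cc == 0)
		p->slot_status |= SLTSTA_CC;
	else if (p->reads_until_cc > 0)
		p->reads_until_cc--;
	return p->slot_status;
}

/* Mock register write: status bits are write-1-to-clear. */
static void mock_write_sltsta(struct mock_port *p, uint16_t val)
{
	p->slot_status &= (uint16_t)~val;
}

/* Placeholder for msleep(); does not actually sleep in this sketch. */
static void mock_sleep_ms(int ms)
{
	(void)ms;
}

/*
 * Poll for Command Completed for up to timeout_ms, checking every step_ms.
 * Returns 0 and clears CC on success, -ETIMEDOUT otherwise. The condition
 * is evaluated one last time after the deadline has passed.
 */
static int wait_sltctl_cmd(struct mock_port *p, int timeout_ms, int step_ms)
{
	int waited = 0;

	assert(step_ms > 0);	/* a zero step would never reach the deadline */

	for (;;) {
		uint16_t sta = mock_read_sltsta(p);

		if (sta & SLTSTA_CC) {
			mock_write_sltsta(p, SLTSTA_CC);
			return 0;
		}
		if (waited >= timeout_ms)
			return -ETIMEDOUT;
		mock_sleep_ms(step_ms);
		waited += step_ms;
	}
}
```

With a mock port whose CC bit appears after a few reads, wait_sltctl_cmd(&port,
1000, 10) returns 0 and leaves CC cleared; with a port that never completes, it
returns -ETIMEDOUT instead of a bare -1, which may be easier for callers to
interpret.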

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer