Re: [PATCH] Revert "PCI/LINK: Report degraded links via link bandwidth notification"

From: Bjorn Helgaas
Date: Tue Apr 30 2019 - 22:13:07 EST


On Tue, Apr 30, 2019 at 12:18:13PM -0600, Keith Busch wrote:
> On Tue, Apr 30, 2019 at 12:05:09PM -0600, Keith Busch wrote:
> > On Tue, Apr 30, 2019 at 11:11:51AM -0500, Bjorn Helgaas wrote:
> > > > I'm not convinced a revert is the best call.
> > >
> > > I have very limited options at this stage of the release, but I'd be
> > > glad to hear suggestions. My concern is that if we release v5.1
> > > as-is, we'll spend a lot of energy on those false positives.
> >
> > May be too late now if the revert is queued up, but I think this feature
> > should have been a default 'false' Kconfig bool rather than always on.

Since this feature currently just adds a message in dmesg, which we
don't really consider a stable API, I think a Kconfig switch is a
reasonable option.

If you send me a signed-off-by for the following patch, I can apply it:

commit 302b77157e66
Author: Keith Busch <kbusch@xxxxxxxxxx>
Date: Tue Apr 30 12:18:13 2019 -0600

PCI/LINK: Add Kconfig option (default off)

e8303bb7a75c ("PCI/LINK: Report degraded links via link bandwidth
notification") added dmesg logging whenever a link changes speed or width
to a state that is considered degraded. Unfortunately, it cannot
differentiate signal integrity-related link changes from those
intentionally initiated by an endpoint driver, including drivers that may
live in userspace or VMs when making use of vfio-pci. Some GPU drivers
actively manage the link state to save power, which generates a stream of
messages like this:

vfio-pci 0000:07:00.0: 32.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s x16 link at 0000:00:02.0 (capable of 64.000 Gb/s with 5 GT/s x16 link)

Since we can't distinguish the intentional changes from the signal
integrity issues, leave the reporting turned off by default. Add a Kconfig
option to turn it on if desired.

Fixes: e8303bb7a75c ("PCI/LINK: Report degraded links via link bandwidth notification")

diff --git a/drivers/pci/pcie/Kconfig b/drivers/pci/pcie/Kconfig
index 5cbdbca904ac..4a094f0d2856 100644
--- a/drivers/pci/pcie/Kconfig
+++ b/drivers/pci/pcie/Kconfig
@@ -142,3 +142,12 @@ config PCIE_PTM
 
 	  This is only useful if you have devices that support PTM, but it
 	  is safe to enable even if you don't.
+
+config PCIE_BW
+	bool "PCI Express Bandwidth Change Notification"
+	default n
+	depends on PCIEPORTBUS
+	help
+	  This enables PCI Express Bandwidth Change Notification. If
+	  you know link width or rate changes occur only to correct
+	  unreliable links, you may answer Y.
diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile
index f1d7bc1e5efa..d356a5bdb158 100644
--- a/drivers/pci/pcie/Makefile
+++ b/drivers/pci/pcie/Makefile
@@ -3,7 +3,6 @@
# Makefile for PCI Express features and port driver

pcieportdrv-y := portdrv_core.o portdrv_pci.o err.o
-pcieportdrv-y += bw_notification.o

obj-$(CONFIG_PCIEPORTBUS) += pcieportdrv.o

@@ -13,3 +12,4 @@ obj-$(CONFIG_PCIEAER_INJECT) += aer_inject.o
obj-$(CONFIG_PCIE_PME) += pme.o
obj-$(CONFIG_PCIE_DPC) += dpc.o
obj-$(CONFIG_PCIE_PTM) += ptm.o
+obj-$(CONFIG_PCIE_BW) += bw_notification.o
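
For anyone puzzling over the numbers in the example message in the commit log: the 32.000 Gb/s and 64.000 Gb/s figures follow from the 8b/10b encoding used at 2.5 GT/s and 5 GT/s, where only 8 of every 10 transferred bits carry data. Below is a minimal sketch of that arithmetic. The helper names are hypothetical and this is not the kernel's implementation; it also assumes 8b/10b throughout, so it does not apply to 8 GT/s and faster links, which use 128b/130b.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): usable bandwidth in Mb/s
 * for a link running at rate_mtps mega-transfers per second per lane.
 * Assumes 8b/10b encoding, i.e. 2.5 GT/s or 5 GT/s links only.
 */
static uint32_t pcie_bw_mbps(uint32_t rate_mtps, uint32_t lanes)
{
	/* 8 data bits for every 10 bits on the wire */
	return rate_mtps * lanes * 8 / 10;
}

/*
 * Hypothetical helper: nonzero if the current link state offers less
 * bandwidth than the link is capable of. As the commit log notes, this
 * comparison alone cannot tell a signal integrity problem from a driver
 * deliberately lowering the link speed to save power.
 */
static int pcie_link_is_degraded(uint32_t cur_mtps, uint32_t cur_lanes,
				 uint32_t cap_mtps, uint32_t cap_lanes)
{
	return pcie_bw_mbps(cur_mtps, cur_lanes) <
	       pcie_bw_mbps(cap_mtps, cap_lanes);
}
```

With the values from the example message, pcie_bw_mbps(2500, 16) gives 32000 and pcie_bw_mbps(5000, 16) gives 64000, so the x16 link at 2.5 GT/s is reported as limited even though the width is unchanged.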