Re: Issues with "PCI/LINK: Report degraded links via link bandwidth notification"

From: Alex G.
Date: Thu Jan 28 2021 - 19:08:26 EST


On 1/28/21 5:51 PM, Sinan Kaya wrote:
> On 1/28/2021 6:39 PM, Bjorn Helgaas wrote:
>> AFAICT, this thread petered out with no resolution.
>>
>> If the bandwidth change notifications are important to somebody,
>> please speak up, preferably with a patch that makes the notifications
>> disabled by default and adds a parameter to enable them (or some other
>> strategy that makes sense).
>>
>> I think these are potentially useful, so I don't really want to just
>> revert them, but if nobody thinks these are important enough to fix,
>> that's a possibility.
>
> Hide behind debug or expert option by default? or even mark it as BROKEN
> until someone fixes it?

Instead of making it a config option, wouldn't it be better as a kernel parameter? People encountering this seem quite comfortable passing kernel arguments, so a "pcie_bw_notification=off" option would solve their problems.
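
To make the intended semantics concrete, here is a rough userspace sketch of how such an option could be interpreted. It is only an illustration under assumptions: "pcie_bw_notification" is the name proposed above and not an existing kernel parameter, the function name is made up, and defaulting to "enabled" is just one possible choice. In the kernel itself this would more likely hang off __setup() or a module parameter than hand-rolled parsing.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Userspace sketch: given a kernel command line string, decide whether
 * bandwidth notifications stay enabled. "pcie_bw_notification" is the
 * name proposed in this mail, not an existing kernel parameter.
 */
bool bw_notification_enabled(const char *cmdline)
{
	static const char key[] = "pcie_bw_notification=";
	const size_t keylen = sizeof(key) - 1;
	bool enabled = true;	/* assumed default: notifications on */
	const char *p = cmdline;

	while (*p) {
		/* Skip separators, then measure the next token. */
		p += strspn(p, " \t\n");
		size_t len = strcspn(p, " \t\n");

		if (len > keylen && strncmp(p, key, keylen) == 0) {
			const char *val = p + keylen;
			size_t vlen = len - keylen;

			if (vlen == 3 && strncmp(val, "off", 3) == 0)
				enabled = false;
			else if (vlen == 2 && strncmp(val, "on", 2) == 0)
				enabled = true;
			/* Unknown values are ignored; the last valid one wins. */
		}
		p += len;
	}
	return enabled;
}
```

With this sketch, a command line containing "pcie_bw_notification=off" yields false, and a command line without the option keeps the default.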

As far as marking this as broken, I've seen no conclusive evidence to tell whether it's a software bug or an actual hardware problem. Could we have a sysfs attribute to disable this on a per-downstream-port basis?

e.g.
echo 0 > /sys/bus/pci/devices/0000:00:04.0/bw_notification_enabled

This probably won't be ideal if there are many devices downtraining their links ad hoc. At the very least, we'd have a way to silence those messages if we do encounter such devices.
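
For a box with many noisy ports, a small userspace helper could do the same write per port. The sketch below is hypothetical: the "bw_notification_enabled" attribute is only the name suggested above and does not exist today, and both function names are made up for illustration. The sysfs root is passed in as an argument so the helper can be pointed at a scratch directory for testing.

```c
#include <stdio.h>

/* Hypothetical attribute name, taken from the example in this mail. */
#define BW_ATTR "bw_notification_enabled"

/* Build "<root>/bus/pci/devices/<bdf>/<attr>"; 0 on success, -1 if truncated. */
int bw_attr_path(char *buf, size_t len, const char *sysfs_root, const char *bdf)
{
	int n = snprintf(buf, len, "%s/bus/pci/devices/%s/%s",
			 sysfs_root, bdf, BW_ATTR);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write "0" or "1" to the attribute of one port; returns 0 on success. */
int bw_notification_set(const char *sysfs_root, const char *bdf, int enable)
{
	char path[256];
	FILE *f;
	int ok;

	if (bw_attr_path(path, sizeof(path), sysfs_root, bdf) != 0)
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;	/* attribute absent or no permission */

	ok = fputs(enable ? "1\n" : "0\n", f) >= 0;
	/* Write errors can surface at flush time, so check fclose() too. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

Calling bw_notification_set("/sys", "0000:00:04.0", 0) would be the equivalent of the echo above, and it fails cleanly on kernels where the attribute is not present.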

Alex