Re: [PATCH] PCI: Add link_change error handler and vfio-pci user

From: Alex_Gagniuc
Date: Wed Apr 24 2019 - 17:34:20 EST


On 4/23/2019 5:42 PM, Alex Williamson wrote:
> The PCIe bandwidth notification service generates logging any time a
> link changes speed or width to a state that is considered downgraded.
> Unfortunately, it cannot differentiate signal integrity related link
> changes from those intentionally initiated by an endpoint driver,
> including drivers that may live in userspace or VMs when making use
> of vfio-pci. Therefore, allow the driver to have a say in whether
> the link is indeed downgraded and worth noting in the log, or if the
> change is perhaps intentional.
>
> For vfio-pci, we don't know the intentions of the user/guest driver
> either, but we do know that GPU drivers in guests actively manage
> the link state and therefore trigger the bandwidth notification for
> what appear to be entirely intentional link changes.
>
> Fixes: e8303bb7a75c PCI/LINK: Report degraded links via link bandwidth notification
> Link: https://lore.kernel.org/linux-pci/155597243666.19387.1205950870601742062.stgit@xxxxxxxxxx/T/#u
> Signed-off-by: Alex Williamson <alex.williamson@xxxxxxxxxx>
> ---
>
> Changing to pci_dbg() logging is not super usable, so let's try the
> previous idea of letting the driver handle link change events as they
> see fit. Ideally this might be two patches, but for easier handling,
> folding the pci and vfio-pci bits together. Comments? Thanks,

I think this callback opens up a can of worms, where drivers can ad hoc
kill a number of what would otherwise be indicators of problems. But I don't
have to like it to review it :).

> drivers/pci/probe.c | 13 +++++++++++++
> drivers/vfio/pci/vfio_pci.c | 10 ++++++++++
> include/linux/pci.h | 3 +++
> 3 files changed, 26 insertions(+)
>
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 7e12d0163863..233cd4b5b6e8 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -2403,6 +2403,19 @@ void pcie_report_downtraining(struct pci_dev *dev)

I don't think you want to change pcie_report_downtraining(). By its name,
the function advertises that it "reports" something, but with this change
it can call a notification callback and return without reporting anything.
It is also used during probe, so you've now just killed your chance to
notice you've booted with a degraded link.
If what you want to do is silence the bandwidth notification, you want
to modify the threaded interrupt handler that calls this.
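
One way to read this suggestion, as a userspace sketch only: keep the
probe-time report unconditional and consult the driver solely on the
runtime notification path. Every mock_* name below is hypothetical and
stands in for kernel code that is not shown in this thread; the real
report also prints only when the link is actually constrained, which the
mock leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct pci_dev. */
struct mock_dev {
	/* Non-NULL if the bound driver wants link change events. */
	void (*link_change)(struct mock_dev *dev);
	int reports;        /* times link status was printed */
	int notifications;  /* times the driver callback ran */
};

/* Stand-in for printing the link status to the log. */
static void mock_print_link_status(struct mock_dev *dev)
{
	dev->reports++;
}

/* Probe-time path: always report, never consult the driver. */
void mock_report_downtraining(struct mock_dev *dev)
{
	assert(dev != NULL);
	mock_print_link_status(dev);
}

/* Runtime path, standing in for the bandwidth notification IRQ thread. */
void mock_bw_notification_thread(struct mock_dev *dev)
{
	assert(dev != NULL);
	if (dev->link_change) {
		/* Driver takes over; nothing is logged here. */
		dev->link_change(dev);
		return;
	}
	mock_print_link_status(dev);
}

/* Example driver callback that only counts the event. */
void mock_quiet_link_change(struct mock_dev *dev)
{
	dev->notifications++;
}
```

With this split, a driver that installs a callback silences only the
runtime notification; a link that is already degraded at boot still gets
logged.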

> if (PCI_FUNC(dev->devfn) != 0 || dev->is_virtfn)
> return;
>
> + /*
> + * If driver handles link_change event, defer to driver. PCIe drivers
> + * can call pcie_print_link_status() to print current link info.
> + */
> + device_lock(&dev->dev);
> + if (dev->driver && dev->driver->err_handler &&
> + dev->driver->err_handler->link_change) {
> + dev->driver->err_handler->link_change(dev);
> + device_unlock(&dev->dev);
> + return;
> + }
> + device_unlock(&dev->dev);

Can we write this such that there is a single lock()/unlock() pair?
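
For illustration, a userspace sketch of one possible shape: record the
outcome in a local flag so the lock is taken and released exactly once.
The mock_* types and helpers are hypothetical stand-ins for struct
pci_dev, device_lock()/device_unlock() and the print helper, not the
kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mock_dev;

/* Hypothetical stand-ins for pci_error_handlers and pci_driver. */
struct mock_err_handlers {
	void (*link_change)(struct mock_dev *dev);
};

struct mock_driver {
	const struct mock_err_handlers *err_handler;
};

struct mock_dev {
	struct mock_driver *driver;
	int lock_depth;    /* 1 while the mock lock is held */
	int lock_calls;
	int unlock_calls;
	int handled;       /* times link_change ran */
	int printed;       /* times link status was printed */
};

static void mock_device_lock(struct mock_dev *dev)
{
	assert(dev->lock_depth == 0);
	dev->lock_depth++;
	dev->lock_calls++;
}

static void mock_device_unlock(struct mock_dev *dev)
{
	assert(dev->lock_depth == 1);
	dev->lock_depth--;
	dev->unlock_calls++;
}

static void mock_print_link_status(struct mock_dev *dev)
{
	dev->printed++;
}

/* Returns true if the driver consumed the event; one lock/unlock pair. */
static bool mock_notify_driver(struct mock_dev *dev)
{
	bool handled = false;

	mock_device_lock(dev);
	if (dev->driver && dev->driver->err_handler &&
	    dev->driver->err_handler->link_change) {
		dev->driver->err_handler->link_change(dev);
		handled = true;
	}
	mock_device_unlock(dev);

	return handled;
}

void mock_report_downtraining(struct mock_dev *dev)
{
	if (mock_notify_driver(dev))
		return;

	mock_print_link_status(dev);
}

/* Example callback that only counts the event. */
void mock_link_change(struct mock_dev *dev)
{
	dev->handled++;
}
```

The callback still runs under the lock, as in the quoted hunk; only the
duplicated unlock on the early-return path goes away.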

> +
> /* Print link status only if the device is constrained by the fabric */
> __pcie_print_link_status(dev, false);
> }
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index cab71da46f4a..c9ffc0ccabb3 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -1418,8 +1418,18 @@ static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
> return PCI_ERS_RESULT_CAN_RECOVER;
> }
>
> +/*
> + * Ignore link change notification, we can't differentiate signal related
> + * link changes from user driver power management type operations, so do
> + * nothing. Potentially this could be routed out to the user.
> + */
> +static void vfio_pci_link_change(struct pci_dev *pdev)
> +{
> +}
> +
> static const struct pci_error_handlers vfio_err_handlers = {
> .error_detected = vfio_pci_aer_err_detected,
> + .link_change = vfio_pci_link_change,
> };
>
> static struct pci_driver vfio_pci_driver = {
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 27854731afc4..e9194bc03f9e 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -763,6 +763,9 @@ struct pci_error_handlers {
>
> /* Device driver may resume normal operations */
> void (*resume)(struct pci_dev *dev);
> +
> + /* PCIe link change notification */
> + void (*link_change)(struct pci_dev *dev);
> };
>