Re: [PATCH] PCI: s390: Handle ARI on bus without associated struct pci_dev

From: Niklas Schnelle
Date: Tue Jul 30 2024 - 15:59:39 EST


On Tue, 2024-07-30 at 21:36 +0200, Niklas Schnelle wrote:
> On s390 PCI busses are virtualized and the downstream ports are
> invisible to the OS and struct pci_bus::self is NULL. This associated
> struct pci_dev is however relied upon in pci_ari_enabled() to check
> whether ARI is enabled for the bus. ARI is therefor always detected as
> disabled.
>
> At the same time firmware on s390 always enables and relies upon ARI
> thus causing a mismatch. Moreover with per-PCI function pass-through
> there may exist busses with no function with devfn 0. For example
> a SR-IOV capable device with two PFs may have separate function
> dependency link chains for each of the PFs and their child VFs. In this
> case the OS may only see the second PF and its child VFs on a bus
> without a devfn 0 function. A situation which is also not supported by
> the common pci_configure_ari() code.
>
> Dispite simply being a mismatch this causes problems as some PCI devices
> present a different SR-IOV topology depending on PCI_SRIOV_CTRL_ARI.
>
> A similar mismatch may occur with SR-IOV when virtfn_add_bus() creates new
> busses with no associated struct pci_dev. Here too pci_ari_enabled()
> on these busses would return false even if ARI is actually used.
>
> Prevent both mismatches by moving the ari_enabled flag from struct
> pci_dev to struct pci_bus making it independent from struct pci_bus::
> self. Let the bus inherit the ari_enabled state from its parent bus when
> there is no bridge device such that busses added by virtfn_add_bus()
> match their parent. For s390 set ari_enabled when the device supports
> ARI in the awareness that all PCIe ports on s390 systems are ARI
> capable.
>
> Signed-off-by: Niklas Schnelle <schnelle@xxxxxxxxxxxxx>
> ---
> arch/s390/pci/pci_bus.c | 12 ++++++++++++
> drivers/pci/pci.c | 4 ++--
> drivers/pci/probe.c | 1 +
> include/linux/pci.h | 4 ++--
> 4 files changed, 17 insertions(+), 4 deletions(-)
>
> diff --git a/arch/s390/pci/pci_bus.c b/arch/s390/pci/pci_bus.c
> index daa5d7450c7d..021319438dad 100644
> --- a/arch/s390/pci/pci_bus.c
> +++ b/arch/s390/pci/pci_bus.c
> @@ -278,6 +278,18 @@ void pcibios_bus_add_device(struct pci_dev *pdev)
> {
> struct zpci_dev *zdev = to_zpci(pdev);
>
> + /*
> + * On s390 PCI busses are virtualized and the bridge
> + * devices are invisible to the OS. Furthermore busses
> + * may exist without a devfn 0 function. Thus the normal
> + * ARI detection does not work. At the same time fw/hw
> + * has always enabled ARI when possible. Reflect the actual
> + * state by setting ari_enabled whenever a device on the bus
> + * supports it.
> + */
> + if (pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ARI))
> + zdev->zbus->bus->ari_enabled = 1;
> +

@Bjorn unstead of adding the above code to s390 specific code an
alternative I considered would be to modify pci_configure_ari() like
below. I tested this as well but wasn't sure if it is too much churn
especially the handling of the dev->devfn != 0 case. Then again it
might be nice to have this in common code.

@@ -3523,12 +3524,18 @@ void pci_configure_ari(struct pci_dev *dev)
u32 cap;
struct pci_dev *bridge;

- if (pcie_ari_disabled || !pci_is_pcie(dev) || dev->devfn)
+ if (pcie_ari_disabled || !pci_is_pcie(dev))
+ return;
+
+ if (dev->devfn && !hypervisor_isolated_pci_functions())
return;

bridge = dev->bus->self;
- if (!bridge)
+ if (!bridge) {
+ if (pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ARI))
+ dev->bus->ari_enabled = 1;
return;
+ }

pcie_capability_read_dword(bridge, PCI_EXP_DEVCAP2, &cap);
if (!(cap & PCI_EXP_DEVCAP2_ARI))