Re: [PATCH v4] pci: Make return value of pcie_capability_read*() consistent

From: Bjorn Helgaas
Date: Mon Apr 27 2020 - 14:13:09 EST


[+cc Thomas, Michael, linux-mips, linux-ppc, LKML
Background:

- PCI config accessors (pci_read_config_word(), etc) return 0 or a
positive error (PCIBIOS_BAD_REGISTER_NUMBER, etc).

- PCI Express capability accessors (pcie_capability_read_word(),
etc) return 0, a negative error (-EINVAL), or a positive error
(PCIBIOS_BAD_REGISTER_NUMBER, etc).

- The PCI Express case is hard for callers to deal with. The
original plan was to convert this case to either return 0 or
positive errors, just like pci_read_config_word().

- I'm raising the possibility of instead getting rid of the positive
PCIBIOS_* error values completely and replacing them with -EINVAL,
-ENOENT, etc.

- Very few callers check the return codes at all. Most of the ones
that do either check for non-zero or use pcibios_err_to_errno() to
convert PCIBIOS_* to -EINVAL, etc.

I added MIPS and powerpc folks to CC: just as FYI because you're the
biggest users of PCIBIOS_*. The intent is that this would be zero
functional change.
]

On Sun, Apr 26, 2020 at 11:51:30AM +0200, Saheed Bolarinwa wrote:
> On 4/25/20 12:30 AM, Bjorn Helgaas wrote:
> > On Fri, Apr 24, 2020 at 04:27:11PM +0200, Bolarinwa Olayemi Saheed wrote:
> > > pcie_capability_read*() could return 0, -EINVAL, or any of the
> > > PCIBIOS_* error codes (which are positive).
> > > This is behaviour is now changed to return only PCIBIOS_* error
> > > codes on error.
> > > This is consistent with pci_read_config_*(). Callers can now have
> > > a consistent way for checking which error has occurred.
> > >
> > > An audit of the callers of this function was made and no case was found
> > > where there is need for a change within the caller function or their
> > > dependencies down the heirarchy.
> > > Out of all caller functions discovered only 8 functions either persist the
> > > return value of pcie_capability_read*() or directly pass on the return
> > > value.
> > >
> > > 1.) "./drivers/infiniband/hw/hfi1/pcie.c" :
> > > => pcie_speeds() line-306
> > >
> > > if (ret) {
> > > dd_dev_err(dd, "Unable to read from PCI config\n");
> > > return ret;
> > > }
> > >
> > > remarks: The variable "ret" is the captured return value.
> > > This function passes on the return value. The return value was
> > > store only by hfi1_init_dd() line-15076 in
> > > ./drivers/infiniband/hw/hfi1/chip.c and it behave the same on all
> > > errors. So this patch will not require a change in this function.
> > Thanks for the analysis, but I don't think it's quite complete.
> > Here's the call chain I see:
> >
> > local_pci_probe
> > pci_drv->probe(..)
> > init_one # hfi1_pci_driver.probe method
> > hfi1_init_dd
> > pcie_speeds
> > pcie_capability_read_dword
>
> Thank you for pointing out the call chain. After checking it, I noticed that
> the
>
> error is handled within the chain in two places without being passed on.
>
> 1. init_one() in ./drivers/infiniband/hw/hfil1/init.c
>
>  ret = hfi1_init_dd(dd);
>         if (ret)
>                 goto clean_bail; /* error already printed */
>
> ...
> clean_bail:
>         hfi1_pcie_cleanup(pdev);  /*EXITS*/
>
> 2. hfi1_init_dd() in ./drivers/infiniband/hw/hfil1/chip.c
>
>         ret = pcie_speeds(dd);
>         if (ret)
>                 goto bail_cleanup;
>
>         ...
>
>         bail_cleanup:
>                  hfi1_pcie_ddcleanup(dd);  /*EXITS*/
>
> > If pcie_capability_read_dword() returns any non-zero value, that value
> > propagates all the way up and is eventually returned by init_one().
> > init_one() id called by local_pci_probe(), which interprets:
> >
> > < 0 as failure
> > 0 as success, and
> > > 0 as "success but warn"
> >
> > So previously an error from pcie_capability_read_dword() could cause
> > either failure or "success but warn" for the probe method, and after
> > this patch those errors will always cause "success but warn".
> >
> > The current behavior is definitely a bug: if
> > pci_bus_read_config_word() returns PCIBIOS_BAD_REGISTER_NUMBER, that
> > causes pcie_capability_read_dword() to also return
> > PCIBIOS_BAD_REGISTER_NUMBER, which will lead to the probe succeeding
> > with a warning, when it should fail.
> >
> > I think the fix is to make pcie_speeds() call pcibios_err_to_errno():
> >
> > ret = pcie_capability_read_dword(...);
> > if (ret) {
> > dd_dev_err(...);
> > return pcibios_err_to_errno(ret);
> > }
>
> I agree that this fix is needed, so that PCIBIOS_* error code are
> not passed on but replaced
>
> with one consistent with non-PCI error codes.
>
> > That could be its own separate preparatory patch before this
> > adjustment to pcie_capability_read_dword().
> >
> > I didn't look at the other cases below, so I don't know whether
> > they are similar hidden problems.
>
> I will check again, please I will like to clarify if it will be to
> fine to just implement the conversion
>
> (as suggested for pcie_speeds) in all found references, which passes
> on the error code.

I'm starting to think we're approaching this backwards. I searched
for PCIBIOS_FUNC_NOT_SUPPORTED, PCIBIOS_BAD_VENDOR_ID, and the other
error values. Almost every use is a *return* in a config accessor.
There are very, very few *tests* for these values.

For example, the only tests for PCIBIOS_FUNC_NOT_SUPPORTED are in
xen_pcibios_err_to_errno() and pcibios_err_to_errno(), i.e., we're
just converting that value to -ENOENT or the Xen-specific thing.

So I think the best approach might be to remove the PCIBIOS_* error
values completely and replace them with the corresponding values from
pcibios_err_to_errno(). For example, a part of the patch would look
like this:

diff --git a/arch/mips/pci/ops-emma2rh.c b/arch/mips/pci/ops-emma2rh.c
index 65f47344536c..d4d9c902c147 100644
--- a/arch/mips/pci/ops-emma2rh.c
+++ b/arch/mips/pci/ops-emma2rh.c
@@ -100,7 +100,7 @@ static int pci_config_read(struct pci_bus *bus, unsigned int devfn, int where,
break;
default:
emma2rh_out32(EMMA2RH_PCI_IWIN0_CTR, backup_win0);
- return PCIBIOS_FUNC_NOT_SUPPORTED;
+ return -ENOENT;
}

emma2rh_out32(EMMA2RH_PCI_IWIN0_CTR, backup_win0);
@@ -149,7 +149,7 @@ static int pci_config_write(struct pci_bus *bus, unsigned int devfn, int where,
break;
default:
emma2rh_out32(EMMA2RH_PCI_IWIN0_CTR, backup_win0);
- return PCIBIOS_FUNC_NOT_SUPPORTED;
+ return -ENOENT;
}
*(volatile u32 *)(base + (PCI_FUNC(devfn) << 8) +
(where & 0xfffffffc)) = data;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 83ce1cdf5676..f95637a8d391 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -675,7 +675,6 @@ static inline bool pci_dev_msi_enabled(struct pci_dev *pci_dev) { return false;

/* Error values that may be returned by PCI functions */
#define PCIBIOS_SUCCESSFUL 0x00
-#define PCIBIOS_FUNC_NOT_SUPPORTED 0x81
#define PCIBIOS_BAD_VENDOR_ID 0x83
#define PCIBIOS_DEVICE_NOT_FOUND 0x86
#define PCIBIOS_BAD_REGISTER_NUMBER 0x87
@@ -689,8 +688,6 @@ static inline int pcibios_err_to_errno(int err)
return err; /* Assume already errno */

switch (err) {
- case PCIBIOS_FUNC_NOT_SUPPORTED:
- return -ENOENT;
case PCIBIOS_BAD_VENDOR_ID:
return -ENOTTY;
case PCIBIOS_DEVICE_NOT_FOUND: