Re: [PATCH 2/2] PCI: pciehp: Fix wrong failure check on pcie_capability_read_*()

From: Bjorn Helgaas
Date: Fri Jun 26 2020 - 15:14:28 EST


On Sat, Jun 20, 2020 at 11:09:36AM +0200, Lukas Wunner wrote:
> On Fri, Jun 19, 2020 at 10:12:19PM +0200, refactormyself@xxxxxxxxx wrote:
> > On failure, pcie_capability_read_*() will set the status value,
> > its last parameter, to 0 and not ~0.
> > This bug fix checks for the proper value.
>
> If a config space read times out, the PCIe controller fabricates
> an "all ones" response. The code is checking for such a timeout,
> not for an error. Hence the code is fine.

In the typical case, the pci_read_config_word() done by
pcie_capability_read_word() will not return an error, so if the read
times out, we should see slot_status == ~0.

But if dev->error_state can be set to pci_channel_io_perm_failure,
pci_read_config_word() will return an error because
pci_dev_is_disconnected() is true, and pcie_capability_read_word()
then sets *val to 0, so slot_status would be 0.
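
To make the two cases concrete, here is a minimal user-space sketch, not
kernel code. All mock_* and MOCK_* names are hypothetical stand-ins for
the kernel's types and accessors. It assumes the low-level read fails
early for a disconnected device and that the capability accessor resets
*val to 0 on any error, as described above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's types and constants. */
enum mock_channel_state { MOCK_IO_NORMAL, MOCK_IO_PERM_FAILURE };

#define MOCK_DEVICE_NOT_FOUND 0x86 /* models PCIBIOS_DEVICE_NOT_FOUND */

struct mock_dev {
	enum mock_channel_state error_state; /* models dev->error_state */
	bool responds;        /* false: the config read times out */
	uint16_t slot_status; /* value a responding device returns */
};

/*
 * Models pci_read_config_word(): returns an error for a disconnected
 * device; otherwise returns success, even when the read timed out and
 * the controller fabricated an all-ones response.
 */
static int mock_read_config_word(const struct mock_dev *dev, uint16_t *val)
{
	if (dev->error_state == MOCK_IO_PERM_FAILURE) {
		*val = (uint16_t)~0;
		return MOCK_DEVICE_NOT_FOUND;
	}
	*val = dev->responds ? dev->slot_status : (uint16_t)~0;
	return 0;
}

/* Models pcie_capability_read_word(): resets *val to 0 on error. */
static int mock_capability_read_word(const struct mock_dev *dev,
				     uint16_t *val)
{
	int ret = mock_read_config_word(dev, val);

	if (ret)
		*val = 0;
	return ret;
}

/* A caller that ignores the return value and only tests for all ones. */
static bool mock_no_response(const struct mock_dev *dev)
{
	uint16_t status;

	mock_capability_read_word(dev, &status);
	return status == (uint16_t)~0;
}
```

In this model a timed-out read is caught by the all-ones test, but a
device marked as permanently failed is not, because the caller sees 0.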

There are a dozen or so places that set dev->error_state. It doesn't
look *likely* that any of them would cause this, but "not likely"
doesn't instill confidence.

It would be a lot nicer if we didn't have to worry about both the 0
and ~0 cases. I keep coming back to the idea of removing the "*val
= 0" code from pcie_capability_read_word() so we wouldn't have that
special case.
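
A rough user-space sketch of that idea, with hypothetical sketch_* names
and a reset_on_error flag added only for comparison. It assumes the
low-level read leaves all ones in *val when it fails for a disconnected
device; that is an assumption of this model, not something the patch
establishes.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the low-level read: a disconnected device
 * yields an error, a timed-out read yields success, and both leave
 * all ones in *val.
 */
static int sketch_read_word(bool disconnected, bool timed_out,
			    uint16_t reg, uint16_t *val)
{
	*val = (disconnected || timed_out) ? (uint16_t)~0 : reg;
	return disconnected ? -1 : 0; /* -1 stands in for a PCIBIOS_* error */
}

/* Capability accessor with the "*val = 0" reset made optional. */
static int sketch_cap_read_word(bool disconnected, bool timed_out,
				uint16_t reg, uint16_t *val,
				bool reset_on_error)
{
	int ret = sketch_read_word(disconnected, timed_out, reg, val);

	if (ret && reset_on_error)
		*val = 0;
	return ret;
}

/* The one check callers would need if the reset were removed. */
static bool sketch_is_all_ones(uint16_t val)
{
	return val == (uint16_t)~0;
}
```

With reset_on_error false, both the disconnected and the timed-out case
end up as all ones, so callers would only have one value to test for.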

In any case, this particular patch doesn't seem like quite the right
fix, so I'll drop it.

> > Signed-off-by: Bolarinwa Olayemi Saheed <refactormyself@xxxxxxxxx>
> > ---
> > drivers/pci/hotplug/pciehp_hpc.c | 10 +++++-----
> > 1 file changed, 5 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
> > index 53433b37e181..c1a67054948a 100644
> > --- a/drivers/pci/hotplug/pciehp_hpc.c
> > +++ b/drivers/pci/hotplug/pciehp_hpc.c
> > @@ -89,7 +89,7 @@ static int pcie_poll_cmd(struct controller *ctrl, int timeout)
> >
> > do {
> > pcie_capability_read_word(pdev, PCI_EXP_SLTSTA, &slot_status);
> > - if (slot_status == (u16) ~0) {
> > + if (slot_status == (u16)0) {
> > ctrl_info(ctrl, "%s: no response from device\n",
> > __func__);
> > return 0;
> > @@ -165,7 +165,7 @@ static void pcie_do_write_cmd(struct controller *ctrl, u16 cmd,
> > pcie_wait_cmd(ctrl);
> >
> > pcie_capability_read_word(pdev, PCI_EXP_SLTCTL, &slot_ctrl);
> > - if (slot_ctrl == (u16) ~0) {
> > + if (slot_ctrl == (u16)0) {
> > ctrl_info(ctrl, "%s: no response from device\n", __func__);
> > goto out;
> > }
> > @@ -236,7 +236,7 @@ int pciehp_check_link_active(struct controller *ctrl)
> > int ret;
> >
> > ret = pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &lnk_status);
> > - if (ret == PCIBIOS_DEVICE_NOT_FOUND || lnk_status == (u16)~0)
> > + if (ret == PCIBIOS_DEVICE_NOT_FOUND || lnk_status == (u16)0)
> > return -ENODEV;
> >
> > ret = !!(lnk_status & PCI_EXP_LNKSTA_DLLLA);
> > @@ -440,7 +440,7 @@ int pciehp_card_present(struct controller *ctrl)
> > int ret;
> >
> > ret = pcie_capability_read_word(pdev, PCI_EXP_SLTSTA, &slot_status);
> > - if (ret == PCIBIOS_DEVICE_NOT_FOUND || slot_status == (u16)~0)
> > + if (ret == PCIBIOS_DEVICE_NOT_FOUND || slot_status == (u16)0)
> > return -ENODEV;
> >
> > return !!(slot_status & PCI_EXP_SLTSTA_PDS);
> > @@ -592,7 +592,7 @@ static irqreturn_t pciehp_isr(int irq, void *dev_id)
> >
> > read_status:
> > pcie_capability_read_word(pdev, PCI_EXP_SLTSTA, &status);
> > - if (status == (u16) ~0) {
> > + if (status == (u16)0) {
> > ctrl_info(ctrl, "%s: no response from device\n", __func__);
> > if (parent)
> > pm_runtime_put(parent);
> > --
> > 2.18.2