Re: [PATCH] PCI: aardvark: Implement re-issuing config requests on CRS response

From: Bjorn Helgaas
Date: Thu Sep 16 2021 - 10:34:46 EST


On Wed, Sep 15, 2021 at 12:55:53PM +0200, Pali Rohár wrote:
> On Tuesday 14 September 2021 15:55:26 Bjorn Helgaas wrote:

> > It is illegal for a device to return CRS after it has returned a
> > successful completion unless an intervening reset has occurred, so
> > drivers and other code should never see it.
> >
> > > And issue is there also with write requests. Is somebody checking return
> > > value of pci_bus_write_config function?
> >
> > Similar case here. The enumeration and wait-after-reset paths always
> > do *reads* until we get a successful completion, so I don't think we
> > ever issue a write that can get CRS.
>
> Yes, in normal conditions we should not see it.
>
> But for testing purposes (that emulated bridge works fine) I'm using
> setpci for changing some configuration.
>
> And via setpci it is possible to turn off CRSSVE bit in which case then
> Root Complex should re-issue request again.
>
> I'm not sure how "legal" it is if userspace / setpci changes some of
> these bits. At least on a hardware with a real Root Port device it
> should be fully transparent. As hardware handles this re-issue and
> kernel then would see (reissued) response.

If setpci changes bits like these, all bets are off. We can't tell
what happened, so we can't rely on any configuration Linux did. I
think we really should taint the kernel when this happens.

> Test case: Initialize device, then unbind it from sysfs, reset it (hot
> reset or warm reset) and then rescan / reinit it again. Here device is
> permitted to send CRS response.
>
> We know that more PCIe cards are buggy and sometimes firmware on cards
> crashes or resets card logic. Which may put card into initialization
> state when it is again permitted to send CRS response.

Yep. That's a buggy device and normally we would work around it with
a quirk. This particular kind of bug would be hard to work around,
but a host bridge driver doesn't seem like the right place to do it
because we'd have to do it in *every* such driver.

Bjorn