Re: [PATCH] pci: add pci_dev_is_alive API

From: Bjorn Helgaas
Date: Wed May 26 2021 - 12:23:11 EST


On Wed, May 26, 2021 at 02:12:38PM +0800, Lambert Wang wrote:
> ...
> The user is our new PCI driver under development for WWAN devices.
> Surprise removal could happen under multiple circumstances.
> e.g. Exception, Link Failure, etc.
>
> We wanted this API to detect surprise removal or check device recovery
> when AER and Hotplug are disabled.
>
> I thought the API could be commonly used for many similar devices.

Be careful with this. pci_device_is_present() is not a good way to
detect surprise removal. Surprise removal can happen at any time. For
example, it can occur after you call pci_device_is_present() but
before you use the result:

  present = pci_device_is_present(pdev);
  /* present == true */
  /* device may be removed here */
  if (present)
    xxx;     /* this operation may fail */

You have to assume that *any* operation on the device can fail because
the device has been removed. In general, there's no response for a
PCIe write to the device, so you can't really check whether a write
has failed.

There *are* responses for reads, of course. If the device has been
removed, a read will cause a failure response. Most PCIe controllers
turn that response into ~0 data to satisfy the read. So the only
reliable way to detect surprise removal is to check for ~0 data when
doing an MMIO read from the device. Of course, ~0 may be either valid
data or a symptom of a failure response, so you may have to do
additional work to distinguish those two cases.
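
As a rough illustration of that check, here is a userspace sketch, not
kernel code. Everything in it is hypothetical: struct fake_dev,
fake_readl() (a stand-in for an MMIO read such as readl()), and the
register layout. It also assumes the device has some register that can
never legitimately read as ~0, which is only one possible way to
distinguish the two cases:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a device's MMIO window (not a kernel type). */
struct fake_dev {
	bool     removed;	/* simulates surprise removal */
	uint32_t regs[4];	/* simulated device registers */
};

/* Hypothetical layout: an ID register that never legitimately reads ~0,
 * and a data register for which ~0 may be a valid value. */
#define FAKE_REG_ID	0
#define FAKE_REG_DATA	1

/* Stand-in for an MMIO read.  A removed device yields all ones,
 * mimicking the data most PCIe controllers fabricate for a failed read. */
static uint32_t fake_readl(const struct fake_dev *dev, unsigned int reg)
{
	return dev->removed ? UINT32_MAX : dev->regs[reg];
}

/* Read a register and report whether the device still appears present.
 * Returns 0 on success, -1 if the device looks like it is gone. */
static int fake_read_checked(const struct fake_dev *dev, unsigned int reg,
			     uint32_t *val)
{
	*val = fake_readl(dev, reg);
	if (*val != UINT32_MAX)
		return 0;	/* real data, the read succeeded */

	/* ~0 is ambiguous: valid data or a failed read.  Re-check with
	 * a register that is assumed never to read ~0 on a live device. */
	if (fake_readl(dev, FAKE_REG_ID) == UINT32_MAX)
		return -1;	/* device appears to be gone */

	return 0;		/* ~0 was valid data */
}
```

Note that the check happens on the result of the read itself, not
before it, so there is no window between "check" and "use". A return
of 0 still says nothing about whether the next access will succeed.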

Bjorn