[+cc Russell, Sam, Oliver since we're talking about the error recoveryAfter our discussion about non-hotplug cases, I am thinking
flow. The code we're talking about is at [1]]
On Thu, Mar 12, 2020 at 11:22:13PM -0700, Kuppuswamy, Sathyanarayanan wrote:
On 3/12/2020 3:32 PM, Bjorn Helgaas wrote:That might be what *should* happen, but I don't think it's what
On Thu, Mar 12, 2020 at 02:59:15PM -0700, Kuppuswamy Sathyanarayanan wrote:Agree. But even if there is a race condition, after clearing DPC
On 3/12/20 12:53 PM, Bjorn Helgaas wrote:I understand that, but these child devices were reset when DPC
On Wed, Mar 11, 2020 at 04:07:59PM -0700, Kuppuswamy Sathyanarayanan wrote:we might need some way to disable the enumeration path till
On 3/11/20 3:23 PM, Bjorn Helgaas wrote:What would the synchronization look like?
Is any synchronization needed here between the EDR path and theIf we want to follow the implementation note step by step (in
hotplug/enumeration path?
sequence) then we need some synchronization between EDR path and
enumeration path. But if it's OK to achieve the same end result by
following steps out of sequence then we don't need to create any
dependency between EDR and enumeration paths. Currently we follow
the latter approach.
we get response from firmware.
In native hot plug case, I think we can do it in two ways.
1. Disable hotplug notification in slot ctl registers.
ÂÂÂ (pcie_disable_notification())
2. Some how block hotplug driver from processing the new
ÂÂÂ events (not sure how feasible its).
Following method 1 would be easy, But I am not sure whether
its alright to disable them randomly. I think, unless we
clear the status as well, we might get some issues due to stale
notification history.
For ACPI event case, I am not sure whether we have some
communication protocol in place to disable receiving ACPI
events temporarily.
For polling model, we need to disable to the polling
timer thread till we receive _OST response from firmware.
Ideally I think it would be better to follow the order in theNone of the above changes will be pretty and I think it will
flowchart if it's not too onerous.
not be simple as well.
That will make the code easier toI don't think this will happen. In DPC reset_link before we bring up
understand. The current situation with this dependency on pciehp and
what it will do leaves a lot of things implicit.
What happens if CONFIG_PCIE_EDR=y but CONFIG_HOTPLUG_PCI_PCIE=n?
IIUC, when DPC triggers, pciehp is what fields the DLLSC interrupt and
unbinds the drivers and removes the devices.
If that doesn't happen, and Linux clears the DPC trigger to bring
the link back up, will those drivers try to operate uninitialized
devices?
the device we wait for link to go down first using
pcie_wait_for_link(pdev, false) function.
disabled the link. When the link comes back up, their BARs
contain zeros.
If CONFIG_HOTPLUG_PCI_PCIE=y, the DLLSC interrupt will cause
pciehp to unbind the driver. It seems like the unbind races with
the EDR notify handler.
trigger status, if hotplug driver properly removes/re-enumerates the
driver then the end result will still be same. There should be no
functional impact.
If pciehp unbinds the driver before edr_handle_event() callsIn the above case, from the kernel perspective device is still
pcie_do_recovery(), this seems fine -- we'll call
dpc_reset_link(), which brings up the link, we won't call any
driver callbacks because there's no driver, and another DLLSC
interrupt will cause pciehp to re-enumerate, which will
re-initialize the device, then rebind the driver.
If the EDR notify handler runs before pciehp unbinds the driver,
accessible and IIUC, it will try to recover it in pcie_do_recovery()
using one of the callbacks.
int (*mmio_enabled)(struct pci_dev *dev);
int (*slot_reset)(struct pci_dev *dev);
void (*resume)(struct pci_dev *dev);
One of these callbacks will do pci_restore_state() to restore the
device, and IO will not attempted in these callbacks until the device
is successfully recovered.
*does* happen.
I don't think we use .mmio_enabled() and .slot_reset() for EDR
because Linux EDR currently depends on DPC, so we'll be using
dpc_reset_link(), which normally returns PCI_ERS_RESULT_RECOVERED,
so pcie_do_recovery() skips .mmio_enabled() and .slot_reset().
Yes. I have tested it only with hotplug enabled. Let me try to disable
I looked at the first few .resume() implementations (FWIW, I used [2]
to find them), and none of them calls pci_restore_state() before doing
I/O to the device:
ioat_pcie_error_resume()
pci_resume() (hfi1)
qib_pci_resume()
cxl_pci_resume()
genwqe_err_resume()
...
But I assume you've tested EDR with some driver that *does* call
pci_restore_state()? Or maybe you have pciehp enabled,
and it alwaysIn hotplug case, it is possible. since reset_link() is called before
wins the race and unbinds the driver before the EDR notification? It
would be interesting to make pciehp *lose* the race and see if
anything breaks.
pci-error-recovery.rst does not mention any requirement for the driver
to call pci_restore_state(), and I think any state restoration like
that should be the responsibility of the PCI core, not the driver.
I don't think .mmio_enabled() is a problem because IIUC, the devicecouldn't EDR bring up the link and call driver .mmio_enabled() beforeCalling mmio_enabled in this case should not be a problem right?
the device has been initialized?
Please check the following content from
Documentation/PCI/pci-error-recovery.rst. IIUC (following content),
IO will not be attempted until the device is successfully
re-configured.
STEP 2: MMIO Enabled
--------------------
This callback is made if all drivers on a segment agree that they can
try to recover and if no automatic link reset was performed by the HW.
If the platform can't just re-enable IOs without a slot reset or a link
reset, it will not call this callback, and instead will have gone
directly to STEP 3 (Link Reset) or STEP 4 (Slot Reset)
If CONFIG_HOTPLUG_PCI_PCIE=n and CONFIG_HOTPLUG_PCI_ACPI=y, I couldprobably in .slot_reset() callback device config will be restored and it
believe that the situations are similar to the above.
What if CONFIG_HOTPLUG_PCI_PCIE=n and CONFIG_HOTPLUG_PCI_ACPI=n? Then
I assume there's nothing to unbind the driver, so pcie_do_recovery()
will call the driver .mmio_enabled() and other recovery callbacks on a
device that hasn't been initialized?
will make the device functional again.
should not have been reset before calling .mmio_enabled().
Yes. Agree. May be the documentation needs to be explicit about it ?
But I think .slot_reset() *is* a problem. I looked at several
.slot_reset() implementations ([3]); some called pci_restore_state(),
but many did not.
If no hotplug driver is enabled, I think the .slot_reset() callbacks
that do not call pci_restore_state() are broken.
Also since in above case hotplug is not supported, topology change will[1] https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/log/?h=review/edr
not be supported.
[2] F='\.resume'; git grep -A 10 "struct pci_error_handlers" | grep "$F\s*=" | sed -e "s/.*$F\s*=\s*//" -e 's/,\s*$//'
[3] F='\.slot_reset'; git grep -A 10 "struct pci_error_handlers" | grep "$F\s*=" | sed -e "s/.*$F\s*=\s*//" -e 's/,\s*$//'