On Thu, Mar 12, 2020 at 02:59:15PM -0700, Kuppuswamy Sathyanarayanan wrote:
Hi Bjorn,
On 3/12/20 12:53 PM, Bjorn Helgaas wrote:
On Wed, Mar 11, 2020 at 04:07:59PM -0700, Kuppuswamy Sathyanarayanan wrote:we might need some way to disable the enumeration path till
On 3/11/20 3:23 PM, Bjorn Helgaas wrote:What would the synchronization look like?
Is any synchronization needed here between the EDR path and theIf we want to follow the implementation note step by step (in
hotplug/enumeration path?
sequence) then we need some synchronization between EDR path and
enumeration path. But if it's OK to achieve the same end result by
following steps out of sequence then we don't need to create any
dependency between EDR and enumeration paths. Currently we follow
the latter approach.
we get response from firmware.
In native hot plug case, I think we can do it in two ways.
1. Disable hotplug notification in slot ctl registers.
ÂÂÂ (pcie_disable_notification())
2. Some how block hotplug driver from processing the new
ÂÂÂ events (not sure how feasible its).
Following method 1 would be easy, But I am not sure whether
its alright to disable them randomly. I think, unless we
clear the status as well, we might get some issues due to stale
notification history.
For ACPI event case, I am not sure whether we have some
communication protocol in place to disable receiving ACPI
events temporarily.
For polling model, we need to disable to the polling
timer thread till we receive _OST response from firmware.
None of the above changes will be pretty and I think it will
Ideally I think it would be better to follow the order in the
flowchart if it's not too onerous.
not be simple as well.
That will make the code easier to
understand. The current situation with this dependency on pciehp and
what it will do leaves a lot of things implicit.
What happens if CONFIG_PCIE_EDR=y but CONFIG_HOTPLUG_PCI_PCIE=n?
IIUC, when DPC triggers, pciehp is what fields the DLLSC interrupt and
unbinds the drivers and removes the devices.
If that doesn't happen, and Linux clears the DPC trigger to bring
the link back up, will those drivers try to operate uninitialized
devices?
I don't think this will happen. In DPC reset_link before we bring up
the device we wait for link to go down first using
pcie_wait_for_link(pdev, false) function.
I understand that, but these child devices were reset when DPC
disabled the link. When the link comes back up, their BARs contain
zeros.
If CONFIG_HOTPLUG_PCI_PCIE=y, the DLLSC interrupt will cause pciehp to
unbind the driver. It seems like the unbind races with the EDR notify
handler. '
pcie_do_recovery(), this seems fine -- we'll call dpc_reset_link(),In the above case, from the kernel perspective device is still accessible and IIUC, it will try to recover it in pcie_do_recovery()
which brings up the link, we won't call any driver callbacks because
there's no driver, and another DLLSC interrupt will cause pciehp to
re-enumerate, which will re-initialize the device, then rebind the
driver.
If the EDR notify handler runs before pciehp unbinds the driver,
couldn't EDR bring up the link and call driver .mmio_enabled() beforeCalling mmio_enabled in this case should not be a problem right ?
the device has been initialized?
If CONFIG_HOTPLUG_PCI_PCIE=n and CONFIG_HOTPLUG_PCI_ACPI=y, I could
believe that the situations are similar to the above.
What if CONFIG_HOTPLUG_PCI_PCIE=n and CONFIG_HOTPLUG_PCI_ACPI=n? Then
I assume there's nothing to unbind the driver, so pcie_do_recovery()
will call the driver .mmio_enabled() and other recovery callbacks on a
device that hasn't been initialized?