On 3/12/2020 2:53 PM, Bjorn Helgaas wrote:
[EXTERNAL EMAIL]
On Wed, Mar 11, 2020 at 04:07:59PM -0700, Kuppuswamy Sathyanarayanan wrote:
On 3/11/20 3:23 PM, Bjorn Helgaas wrote:
Is any synchronization needed here between the EDR path and the
hotplug/enumeration path?

If we want to follow the implementation note step by step (in
sequence) then we need some synchronization between EDR path and
enumeration path. But if it's OK to achieve the same end result by
following steps out of sequence then we don't need to create any
dependency between EDR and enumeration paths. Currently we follow
the latter approach.

What would the synchronization look like?

Ideally I think it would be better to follow the order in the
flowchart if it's not too onerous. That will make the code easier to
understand. The current situation with this dependency on pciehp and
what it will do leaves a lot of things implicit.
What happens if CONFIG_PCIE_EDR=y but CONFIG_HOTPLUG_PCI_PCIE=n?
IIUC, when DPC triggers, pciehp is what fields the DLLSC interrupt and
unbinds the drivers and removes the devices. If that doesn't happen,
and Linux clears the DPC trigger to bring the link back up, will those
drivers try to operate uninitialized devices?
Does EDR need a dependency on CONFIG_HOTPLUG_PCI_PCIE?

From one of Sathya's other responses:

"If hotplug is not supported then there is support to enumerate
devices via polling or ACPI events. But a point to note
here is, enumeration path is independent of error handler path, and
hence there is no explicit trigger or event from error handler path
to enumeration path to kick start the enumeration."

The EDR standard doesn't have any dependency on hot-plug. It sounds like
in the current implementation there's some manual intervention needed if
hot-plug is not supported?

No, there is no need for manual intervention even in non hotplug
Ideally recovery would kick in automatically
but requiring manual intervention is a good first step.

For example, consider the case in flow chart where after sending
success _OST, firmware decides to stop the recovery of the device.
if we follow the flow chart as is then the steps should be,
1. clear the DPC status trigger
2. Send success code via _OST, and wait for return from _OST
3. if successful return then enumerate the child devices and
reassign bus numbers.
In current approach the steps followed are,
1. Clear the DPC status trigger.
2. Send success code via _OST

Success in step 2 is assuming device trained and config space is
accessible correct?

yes. we send success code only if the link is trained and on.

If device was removed or device config space is not
accessible then failure status should be sent via _OST.

yes.
--
2. In parallel, LINK UP event path will enumerate the child devices.
3. if firmware decides not to recover the device, then LINK DOWN
event will eventually remove them again.
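
To make the comparison concrete, the two orderings discussed in this thread (the flowchart order and the current out-of-sequence approach) can be sketched as a toy model. This is only an illustration, not kernel code: the Port class, its fields and every function name below are hypothetical, the parallel LINK UP path is modelled sequentially, and the firmware decision is reduced to a single boolean.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Port:
    """Toy model of a port under EDR; all names here are hypothetical."""
    link_trained: bool = True        # link is up after the trigger is cleared
    firmware_continues: bool = True  # firmware keeps recovering after a success _OST
    children_enumerated: bool = False
    log: List[str] = field(default_factory=list)


def clear_dpc_trigger(port: Port) -> None:
    port.log.append("clear DPC trigger status")


def send_ost(port: Port) -> bool:
    """Send success only if the link is trained, failure otherwise.

    Returns True when success was sent and firmware continues recovery.
    """
    status = "success" if port.link_trained else "failure"
    port.log.append(f"_OST {status}")
    return port.link_trained and port.firmware_continues


def enumerate_children(port: Port) -> None:
    port.children_enumerated = True
    port.log.append("LINK UP: enumerate children")


def remove_children(port: Port) -> None:
    port.children_enumerated = False
    port.log.append("LINK DOWN: remove children")


def flowchart_order(port: Port) -> bool:
    """Steps in flowchart order: enumerate only after _OST returns."""
    clear_dpc_trigger(port)
    if send_ost(port):
        enumerate_children(port)
    return port.children_enumerated


def current_order(port: Port) -> bool:
    """Out-of-sequence order: _OST is sent, enumeration is not gated on it."""
    clear_dpc_trigger(port)
    send_ost(port)  # return value is not awaited in this ordering
    if port.link_trained:
        enumerate_children(port)       # LINK UP event path
        if not port.firmware_continues:
            remove_children(port)      # LINK DOWN event path undoes it
    return port.children_enumerated


if __name__ == "__main__":
    stopped = Port(firmware_continues=False)
    current_order(stopped)
    print("\n".join(stopped.log))
```

In this model both orderings end in the same state for every combination of the two booleans. The difference is only that the out-of-sequence version briefly enumerates the children and then removes them when firmware stops the recovery.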