Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode

From: Bjorn Helgaas
Date: Thu Mar 12 2020 - 18:36:55 EST


On Thu, Mar 12, 2020 at 03:02:07PM -0700, Kuppuswamy Sathyanarayanan wrote:
> On 3/12/20 2:52 PM, Bjorn Helgaas wrote:
> > On Thu, Mar 12, 2020 at 02:29:58PM -0700, Kuppuswamy Sathyanarayanan wrote:
> > > On 3/12/20 2:02 PM, Austin.Bolen@xxxxxxxx wrote:
> > > > On 3/12/2020 2:53 PM, Bjorn Helgaas wrote:
> > > > > On Wed, Mar 11, 2020 at 04:07:59PM -0700, Kuppuswamy Sathyanarayanan wrote:
> > > > > > On 3/11/20 3:23 PM, Bjorn Helgaas wrote:
> > > > > > > Is any synchronization needed here between the EDR path and the
> > > > > > > hotplug/enumeration path?
> > > > > > If we want to follow the implementation note step by step (in
> > > > > > sequence) then we need some synchronization between EDR path and
> > > > > > enumeration path. But if it's OK to achieve the same end result by
> > > > > > following steps out of sequence then we don't need to create any
> > > > > > dependency between EDR and enumeration paths. Currently we follow
> > > > > > the latter approach.
> > > > > What would the synchronization look like?
> > > > >
> > > > > Ideally I think it would be better to follow the order in the
> > > > > flowchart if it's not too onerous. That will make the code easier to
> > > > > understand. The current situation with this dependency on pciehp and
> > > > > what it will do leaves a lot of things implicit.
> > > > >
> > > > > What happens if CONFIG_PCIE_EDR=y but CONFIG_HOTPLUG_PCI_PCIE=n?
> > > > >
> > > > > IIUC, when DPC triggers, pciehp is what fields the DLLSC interrupt and
> > > > > unbinds the drivers and removes the devices. If that doesn't happen,
> > > > > and Linux clears the DPC trigger to bring the link back up, will those
> > > > > drivers try to operate uninitialized devices?
> > > > >
> > > > > Does EDR need a dependency on CONFIG_HOTPLUG_PCI_PCIE?
> > > > From one of Sathya's other responses:
> > > >
> > > > "If hotplug is not supported then there is support to enumerate
> > > > devices via polling or ACPI events. But a point to note
> > > > here is, enumeration path is independent of error handler path, and
> > > > hence there is no explicit trigger or event from error handler path
> > > > to enumeration path to kick start the enumeration."
> > > >
> > > > The EDR standard doesn't have any dependency on hot-plug. It sounds like
> > > > in the current implementation there's some manual intervention needed if
> > > > hot-plug is not supported?
> > >
> > > No, there is no need for manual intervention even in non hotplug
> > > cases.
> > >
> > > For ACPI events case, we would rely on ACPI event to kick start the
> > > enumeration.  And for polling model, there is an independent polling
> > > thread which will kick start the enumeration.
>
> > I'm guessing the ACPI case works via hotplug_is_native(): if
> > CONFIG_HOTPLUG_PCI_PCIE=n, pciehp_is_native() returns false, and
> > acpiphp manages hotplug.
> >
> > What if CONFIG_HOTPLUG_PCI_ACPI=n also?
>
> If none of the auto scans are enabled then we might need some
> manual trigger (rescan). But this would be needed in native
> DPC case as well.
> >
> > Where is the polling thread?
>
> drivers/pci/hotplug/pciehp_hpc.c

Only if CONFIG_HOTPLUG_PCI_PCIE=y, obviously. My question is about
what happens when CONFIG_HOTPLUG_PCI_PCIE=n.

I'm not as concerned about requiring a manual rescan. That's
inconvenient, but doesn't seem like a big deal because that's what you
expect with no hotplug driver.

What I *am* worried about is calling driver callbacks on a device that
has been reset but not initialized. That could cause all sorts of
havoc because the driver thinks it can trust BARs and other
configuration.