Re: [PATCH v2] PCI: endpoint: Use blocking notifier instead of atomic

From: Manivannan Sadhasivam
Date: Wed Mar 23 2022 - 14:09:49 EST


Hi Kishon,

On Wed, Mar 09, 2022 at 10:07:47AM +0530, Kishon Vijay Abraham I wrote:
> Hi Mani,
>
> On 28/02/22 11:58 am, Manivannan Sadhasivam wrote:
> > Hi,
> >
> > On Mon, Feb 28, 2022 at 11:46:52AM +0530, Kishon Vijay Abraham I wrote:
> >> Hi Manivannan,
> >>
> >> On 28/02/22 11:22 am, Manivannan Sadhasivam wrote:
> >>> The use of an atomic notifier causes a sleeping-in-atomic-context bug
> >>> when the EPC core functions are used in the notifier chain. This is
> >>> because the core functions take epc->lock (a mutex) to protect against
> >>> concurrent use of the EPC.
> >>
> >> The notification from the controller to the function driver is used for
> >> propagating interrupts to function driver and should be in interrupt context.
> >> How it should be handled may be left to the function driver. I don't prefer
> >> moving everything to a blocking notifier.
> >>
> >
> > I agree that we need to handle it quickly enough, but I don't see any other
> > valid option to get rid of the issue. The EPF driver may use a non-atomic
> > notifier, but that seems to be an overkill workaround for something that
> > could be fixed in the EPC core.
> >
> > And propagating interrupts is not going to work, or be needed, all the time.
> > Do you foresee any issue with the blocking notifier?
>
> I think any interrupt to the EP should be delivered to the function driver in
> interrupt context. It could be a function level reset interrupt, hot reset
> interrupt, link state interrupt etc. These are not supported right now, but
> they will use the same notification mechanism to propagate interrupts from
> the controller driver to the function driver.
>

In mainline, I can see only 2 users of this notifier:

1. pcie-tegra194
2. pcie-qcom-ep

In both drivers, CORE_INIT is called from a threaded irq handler so it is not
running in interrupt context. And the CORE_INIT of pci-epf-test driver is
calling EPC functions that could potentially sleep.

For LINK_UP, the tegra driver calls it from a hard irq handler, but the LINK_UP
handler of the pci-epf-test driver only queues delayed work. In the qcom
driver, LINK_UP is called from a threaded irq handler.

In both cases I don't see any necessity to use the atomic notifier chain.
I agree with you that the notification needs to be passed from EPC to EPF asap,
but I'm not sure if it really has to be atomic.
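
To make the failure mode concrete, here is a rough userspace model of the two
chain types. It is only a sketch and not kernel code: every model_* name is
hypothetical, the "atomic depth" counter stands in for the no-sleep section an
atomic notifier chain runs its callbacks in, and model_mutex_lock() stands in
for an EPC core function taking epc->lock.

```c
/* Userspace model, not kernel code: all model_* names are hypothetical. */
#include <assert.h>
#include <stddef.h>

enum model_event { CORE_INIT, LINK_UP };

struct model_notifier_block {
	int (*notifier_call)(struct model_notifier_block *nb,
			     unsigned long event, void *data);
	struct model_notifier_block *next;
};

static int model_atomic_depth;	/* > 0: the caller must not sleep */
static int model_sleep_bugs;	/* sleeping-in-atomic violations seen */

/* Models might_sleep(): flag a violation if called in atomic context. */
static void model_might_sleep(void)
{
	if (model_atomic_depth > 0)
		model_sleep_bugs++;
}

/* Models taking epc->lock: a mutex may sleep, so it checks the context. */
static void model_mutex_lock(void)
{
	model_might_sleep();
}

static void model_mutex_unlock(void)
{
}

/* Models an EPF callback that calls an EPC core function on CORE_INIT. */
static int model_epf_notifier(struct model_notifier_block *nb,
			      unsigned long event, void *data)
{
	(void)nb;
	(void)data;
	if (event == CORE_INIT) {
		model_mutex_lock();
		model_mutex_unlock();
	}
	return 0;
}

/* Invoke every callback on the chain, in list order. */
static void model_walk(struct model_notifier_block *head, unsigned long event)
{
	struct model_notifier_block *nb;

	for (nb = head; nb != NULL; nb = nb->next)
		nb->notifier_call(nb, event, NULL);
}

/* Atomic chain: callbacks run inside a no-sleep section. */
static void model_atomic_call_chain(struct model_notifier_block *head,
				    unsigned long event)
{
	model_atomic_depth++;
	model_walk(head, event);
	model_atomic_depth--;
}

/* Blocking chain: may sleep itself, and lets the callbacks sleep. */
static void model_blocking_call_chain(struct model_notifier_block *head,
				      unsigned long event)
{
	model_might_sleep();
	model_walk(head, event);
}

/* Deliver CORE_INIT from a sleepable caller and count the violations. */
int model_count_bugs(int use_atomic_chain)
{
	struct model_notifier_block nb = { model_epf_notifier, NULL };

	assert(nb.notifier_call != NULL);
	model_atomic_depth = 0;
	model_sleep_bugs = 0;
	if (use_atomic_chain)
		model_atomic_call_chain(&nb, CORE_INIT);
	else
		model_blocking_call_chain(&nb, CORE_INIT);
	return model_sleep_bugs;
}
```

In this model, model_count_bugs(1) reports one violation and
model_count_bugs(0) reports none, which mirrors the CORE_INIT case when the
chain is called from a threaded irq handler.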

Thanks,
Mani

> Thanks,
> Kishon
>
> >
> >> I'm wondering how other users for CORE_INIT didn't see this issue.
> >
> > This can be triggered with EPF test or NTB if CONFIG_DEBUG_ATOMIC_SLEEP is
> > enabled.
> >
> > Thanks,
> > Mani
> >
> >>
> >> Thanks,
> >> Kishon
> >>
> >>>
> >>> So switch to blocking notifier for getting rid of the bug as it runs in
> >>> non-atomic context and allows sleeping in notifier chain.
> >>>
> >>> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx>
> >>> ---
> >>>
> >>> Changes in v2:
> >>>
> >>> * Removed the changes related to non-upstreamed patches
> >>>
> >>> drivers/pci/endpoint/pci-epc-core.c | 6 +++---
> >>> include/linux/pci-epc.h | 4 ++--
> >>> 2 files changed, 5 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/drivers/pci/endpoint/pci-epc-core.c b/drivers/pci/endpoint/pci-epc-core.c
> >>> index 3bc9273d0a08..c4347f472618 100644
> >>> --- a/drivers/pci/endpoint/pci-epc-core.c
> >>> +++ b/drivers/pci/endpoint/pci-epc-core.c
> >>> @@ -693,7 +693,7 @@ void pci_epc_linkup(struct pci_epc *epc)
> >>>  	if (!epc || IS_ERR(epc))
> >>>  		return;
> >>>
> >>> -	atomic_notifier_call_chain(&epc->notifier, LINK_UP, NULL);
> >>> +	blocking_notifier_call_chain(&epc->notifier, LINK_UP, NULL);
> >>>  }
> >>> EXPORT_SYMBOL_GPL(pci_epc_linkup);
> >>>
> >>> @@ -710,7 +710,7 @@ void pci_epc_init_notify(struct pci_epc *epc)
> >>>  	if (!epc || IS_ERR(epc))
> >>>  		return;
> >>>
> >>> -	atomic_notifier_call_chain(&epc->notifier, CORE_INIT, NULL);
> >>> +	blocking_notifier_call_chain(&epc->notifier, CORE_INIT, NULL);
> >>>  }
> >>> EXPORT_SYMBOL_GPL(pci_epc_init_notify);
> >>>
> >>> @@ -774,7 +774,7 @@ __pci_epc_create(struct device *dev, const struct pci_epc_ops *ops,
> >>>
> >>>  	mutex_init(&epc->lock);
> >>>  	INIT_LIST_HEAD(&epc->pci_epf);
> >>> -	ATOMIC_INIT_NOTIFIER_HEAD(&epc->notifier);
> >>> +	BLOCKING_INIT_NOTIFIER_HEAD(&epc->notifier);
> >>>
> >>>  	device_initialize(&epc->dev);
> >>>  	epc->dev.class = pci_epc_class;
> >>> diff --git a/include/linux/pci-epc.h b/include/linux/pci-epc.h
> >>> index a48778e1a4ee..04a2e74aed63 100644
> >>> --- a/include/linux/pci-epc.h
> >>> +++ b/include/linux/pci-epc.h
> >>> @@ -149,7 +149,7 @@ struct pci_epc {
> >>>  	/* mutex to protect against concurrent access of EP controller */
> >>>  	struct mutex lock;
> >>>  	unsigned long function_num_map;
> >>> -	struct atomic_notifier_head notifier;
> >>> +	struct blocking_notifier_head notifier;
> >>>  };
> >>>
> >>> /**
> >>> @@ -195,7 +195,7 @@ static inline void *epc_get_drvdata(struct pci_epc *epc)
> >>> static inline int
> >>> pci_epc_register_notifier(struct pci_epc *epc, struct notifier_block *nb)
> >>> {
> >>> -	return atomic_notifier_chain_register(&epc->notifier, nb);
> >>> +	return blocking_notifier_chain_register(&epc->notifier, nb);
> >>> }
> >>>
> >>> struct pci_epc *
> >>>