Re: [PATCH v4 1/2] remoteproc: qcom: Add per subsystem SSR notification

From: Bjorn Andersson
Date: Mon Jun 22 2020 - 14:54:41 EST


On Sat 20 Jun 12:48 PDT 2020, rishabhb@xxxxxxxxxxxxxx wrote:

> On 2020-06-18 16:35, Bjorn Andersson wrote:
> > On Thu 18 Jun 16:00 PDT 2020, Alex Elder wrote:
> >
> > > On 5/27/20 10:34 PM, Rishabh Bhatnagar wrote:
> > > > Currently there is a single notification chain which is called whenever any
> > > > remoteproc shuts down. This leads to all the listeners being notified, and
> > > > is not an optimal design as kernel drivers might only be interested in
> > > > listening to notifications from a particular remoteproc. Create a global
> > > > list of remoteproc notification info data structures. This will hold the
> > > > name and notifier_list information for a particular remoteproc. The API
> > > > to register for notifications will use name argument to retrieve the
> > > > notification info data structure and the notifier block will be added to
> > > > that data structure's notification chain. Also move from blocking notifier
> > > > to srcu notifer based implementation to support dynamic notifier head
> > > > creation.
> > >
> > > I'm looking at these patches now, without having reviewed the
> > > previous versions. Forgive me if I contradict or duplicate
> > > previous feedback.
> > >
> > > I have a number of suggestions, below.
> > >
> > > -Alex
> > >
> >
> > Thanks for your review Alex, some feedback on the patch and your
> > response below.
> >
> > > > Signed-off-by: Siddharth Gupta <sidgup@xxxxxxxxxxxxxx>
> > > > Signed-off-by: Rishabh Bhatnagar <rishabhb@xxxxxxxxxxxxxx>
> > > > ---
> > > > drivers/remoteproc/qcom_common.c | 84 ++++++++++++++++++++++++++++++-----
> > > > drivers/remoteproc/qcom_common.h | 5 ++-
> > > > include/linux/remoteproc/qcom_rproc.h | 20 ++++++---
> > > > 3 files changed, 90 insertions(+), 19 deletions(-)
> > > >
> > > > diff --git a/drivers/remoteproc/qcom_common.c b/drivers/remoteproc/qcom_common.c
> > > > index 9028cea..61ff2dd 100644
> > > > --- a/drivers/remoteproc/qcom_common.c
> > > > +++ b/drivers/remoteproc/qcom_common.c
> > > > @@ -12,6 +12,7 @@
> > > > #include <linux/module.h>
> > > > #include <linux/notifier.h>
> > > > #include <linux/remoteproc.h>
> > > > +#include <linux/remoteproc/qcom_rproc.h>
> > > > #include <linux/rpmsg/qcom_glink.h>
> > > > #include <linux/rpmsg/qcom_smd.h>
> > > > #include <linux/soc/qcom/mdt_loader.h>
> > > > @@ -23,7 +24,14 @@
> > > > #define to_smd_subdev(d) container_of(d, struct qcom_rproc_subdev, subdev)
> > > > #define to_ssr_subdev(d) container_of(d, struct qcom_rproc_ssr, subdev)
> > > > -static BLOCKING_NOTIFIER_HEAD(ssr_notifiers);
> > > > +struct qcom_ssr_subsystem {
> > > > + const char *name;
> > > > + struct srcu_notifier_head notifier_list;
> > > > + struct list_head list;
> > > > +};
> > > > +
> > > > +static LIST_HEAD(qcom_ssr_subsystem_list);
> > > > +DEFINE_MUTEX(qcom_ssr_subsys_lock);
> > >
> > > There is no need for this mutex to be global.
> > >
> > > > static int glink_subdev_start(struct rproc_subdev *subdev)
> > > > {
> > > > @@ -189,39 +197,79 @@ void qcom_remove_smd_subdev(struct rproc *rproc, struct qcom_rproc_subdev *smd)
> > > > }
> > > > EXPORT_SYMBOL_GPL(qcom_remove_smd_subdev);
> > > > +struct qcom_ssr_subsystem *qcom_ssr_get_subsys(const char *name)
> > >
> > > This function should be made private (static).
> > >
> >
> > Yes.
> >
> > > I think the mutex should be taken in this function rather than
> > > the caller (more on this below). But if you leave it this way,
> > > please mention something in a comment that indicates the caller
> > > must hold the qcom_ssr_subsys_lock mutex.
> > >
> >
> > I agree, that would simplify reasoning about the lock.
> >
> > > > +{
> > > > + struct qcom_ssr_subsystem *info;
> > > > +
> > > > + /* Match in the global qcom_ssr_subsystem_list with name */
> > > > + list_for_each_entry(info, &qcom_ssr_subsystem_list, list) {
> > > > + if (!strcmp(info->name, name))
> > > > + return info;
> > >
> > > This probably isn't strictly necessary, because you are
> > > returning a void pointer, but you could do this here:
> > > return ERR_CAST(info);
> >
> > Info is a struct qcom_ssr_subsystem * and that's the function's return
> > type as well, so Rishabh's approach looks correct to me.
> >
> > >
> > > > + }
> > >
> > > This is purely a style thing, but the curly braces around
> > > the loop body aren't necessary.
> > >
> > > > + info = kzalloc(sizeof(*info), GFP_KERNEL);
> > > > + if (!info)
> > > > + return ERR_PTR(-ENOMEM);
> > > > + info->name = kstrdup_const(name, GFP_KERNEL);
> > > > + srcu_init_notifier_head(&info->notifier_list);
> > > > +
> > > > + /* Add to global notif list */
> > >
> > > s/notif/notification/
> > >
> > > > + INIT_LIST_HEAD(&info->list);
> > >
> > > No need to initialize the list element when adding it
> > > to a list. Both uts fields will be overwritten anyway.
> > >
> > > > + list_add_tail(&info->list, &qcom_ssr_subsystem_list);
> > > > +
> > > > + return info;
> > > > +}
> > > > +
> > > > /**
> > > > * qcom_register_ssr_notifier() - register SSR notification handler
> > > > + * @name: name that will be searched in global ssr subsystem list
> > >
> > > Maybe just "SSR subsystem name".
> > >
> > > > * @nb: notifier_block to notify for restart notifications
> > >
> > > Drop or modify "restart" in the comment line above.
> > >
> > > > *
> > > > - * Returns 0 on success, negative errno on failure.
> > > > + * Returns a subsystem cookie on success, ERR_PTR on failure.
> > >
> > > Maybe make the above a @Return: comment.
> > >
> >
> > No @ in that, but "Return: foo" is the appropriate format.
> >
> > > > *
> > > > - * This register the @notify function as handler for restart notifications. As
> > > > - * remote processors are stopped this function will be called, with the SSR
> > > > - * name passed as a parameter.
> > > > + * This registers the @nb notifier block as part the notifier chain for a
> > > > + * remoteproc associated with @name. The notifier block's callback
> > > > + * will be invoked when the particular remote processor is stopped.
> > >
> > > It's not just for stopping, right? Maybe something
> > > more like:
> > > Register to receive notification callbacks when
> > > remoteproc SSR events occur (pre- and post-startup
> > > and pre- and post-shutdown).
> > >
> >
> > And this description of the function should go above the Return:
> >
> > See
> > https://www.kernel.org/doc/html/latest/doc-guide/kernel-doc.html#function-documentation
> >
> > > > */
> > > > -int qcom_register_ssr_notifier(struct notifier_block *nb)
> > > > +void *qcom_register_ssr_notifier(const char *name, struct notifier_block *nb)
> > > > {
> > > > - return blocking_notifier_chain_register(&ssr_notifiers, nb);
> > > > + struct qcom_ssr_subsystem *info;
> > > > +
> > > > + mutex_lock(&qcom_ssr_subsys_lock);
> > >
> > > Can you explain why the mutex is taken here (and in
> > > qcom_add_ssr_subdev()), rather than having the mutex
> > > logic be isolated in qcom_ssr_get_subsys()?
> > >
> > > > + info = qcom_ssr_get_subsys(name);
> > > > + if (IS_ERR(info)) {
> > > > + mutex_unlock(&qcom_ssr_subsys_lock);
> > > > + return info;
> > > > + }
> > >
> > > I don't think there's any need to have the next function
> > > call be protected by the mutex, but maybe I'm mistaken.
> > >
> > > > + srcu_notifier_chain_register(&info->notifier_list, nb);
> > > > + mutex_unlock(&qcom_ssr_subsys_lock);
> > > > + return &info->notifier_list;
> > > > }
> > > > EXPORT_SYMBOL_GPL(qcom_register_ssr_notifier);
> > > > /**
> > > > * qcom_unregister_ssr_notifier() - unregister SSR notification handler
> > > > + * @notify: subsystem coookie returned from qcom_register_ssr_notifier
> > > > * @nb: notifier_block to unregister
> > >
> > > Add a @Return comment (0 on success, %ENOENT otherwise).
> > >
> > > > */
> > > > -void qcom_unregister_ssr_notifier(struct notifier_block *nb)
> > > > +int qcom_unregister_ssr_notifier(void *notify, struct notifier_block *nb)
> > > > {
> > > > - blocking_notifier_chain_unregister(&ssr_notifiers, nb);
> > > > + return srcu_notifier_chain_unregister(notify, nb);
> > > > }
> > > > EXPORT_SYMBOL_GPL(qcom_unregister_ssr_notifier);
> > > > static void ssr_notify_unprepare(struct rproc_subdev *subdev)
> > > > {
> > > > struct qcom_rproc_ssr *ssr = to_ssr_subdev(subdev);
> > > > + struct qcom_ssr_notif_data data = {
> > > > + .name = ssr->info->name,
> > > > + .crashed = false,
> > > > + };
> > > > - blocking_notifier_call_chain(&ssr_notifiers, 0, (void *)ssr->name);
> > > > + srcu_notifier_call_chain(&ssr->info->notifier_list, 0, &data);
> > > > }
> > > > +
> > > > /**
> > > > * qcom_add_ssr_subdev() - register subdevice as restart notification source
> > > > * @rproc: rproc handle
> > > > @@ -229,12 +277,23 @@ static void ssr_notify_unprepare(struct rproc_subdev *subdev)
> > > > * @ssr_name: identifier to use for notifications originating from @rproc
> > > > *
> > > > * As the @ssr is registered with the @rproc SSR events will be sent to all
> > > > - * registered listeners in the system as the remoteproc is shut down.
> > > > + * registered listeners for the particular remoteproc when it is shutdown.
> > > > */
> > > > void qcom_add_ssr_subdev(struct rproc *rproc, struct qcom_rproc_ssr *ssr,
> > > > const char *ssr_name)
> > > > {
> > > > - ssr->name = ssr_name;
> > > > + struct qcom_ssr_subsystem *info;
> > > > +
> > > > + mutex_lock(&qcom_ssr_subsys_lock);
> > > > + info = qcom_ssr_get_subsys(ssr_name);
> > >
> > > If there already exists an SSR subsystem having the given
> > > name, its info structure is returned here. Is that OK?
> > > In practice, I don't expect this to be a problem, but it
> > > would be better to return an error if
> > >
> >
> > You're right...that shouldn't happen. So a WARN_ON() and early return
> > would be in order.
> >
> the info structure needs to be embedded in the qcom_rproc_ssr struct in case
> where clients register for notifications even before that particular ssr
> subdevice is registered. Logically add_ssr_subdev shouldn't happen twice for
> a rproc without doing remove.

You're right, I forgot that part of it. So if anything we would have a
check to see that the given info isn't already associated with a
remoteproc instance. But I don't see a link in that direction, so I'm
fine with ignoring this.

Regards,
Bjorn