Re: [PATCH v2 08/11] driver core: fw_devlink: Make cycle detection more robust

From: Saravana Kannan
Date: Sat Jan 28 2023 - 02:35:30 EST


On Fri, Jan 27, 2023 at 1:43 AM Andy Shevchenko
<andriy.shevchenko@xxxxxxxxxxxxxxx> wrote:
>
> On Thu, Jan 26, 2023 at 04:11:35PM -0800, Saravana Kannan wrote:
> > fw_devlink could only detect a single and simple cycle because it relied
> > mainly on device link cycle detection code that only checked for cycles
> > between devices. The expectation was that the firmware wouldn't have
> > complicated cycles and multiple cycles between devices. That expectation
> > has been proven to be wrong.
> >
> > For example, fw_devlink could handle:
> >
> > +-+ +-+
> > |A+------> |B+
> > +-+ +++
> > ^ |
> > | |
> > +----------+
> >
> > But it couldn't handle even something as "simple" as:
> >
> > +---------------------+
> > | |
> > v |
> > +-+ +-+ +++
> > |A+------> |B+------> |C|
> > +-+ +++ +-+
> > ^ |
> > | |
> > +----------+
> >
> > But firmware has even more complicated cycles like:
> >
> > +---------------------+
> > | |
> > v |
> > +-+ +---+ +++
> > +--+A+------>| B +-----> |C|<--+
> > | +-+ ++--+ +++ |
> > | ^ | ^ | |
> > | | | | | |
> > | +---------+ +---------+ |
> > | |
> > +------------------------------+
> >
> > And this is without including parent child dependencies or nodes in the
> > cycle that are just firmware nodes that'll never have a struct device
> > created for them.
> >
> > The proper way to treat these devices it to not force any probe ordering
> > between them, while still enforce dependencies between node in the
> > cycles (A, B and C) and their consumers.
> >
> > So this patch goes all out and just deals with all types of cycles. It
> > does this by:
> >
> > 1. Following dependencies across device links, parent-child and fwnode
> > links.
> > 2. When it find cycles, it mark the device links and fwnode links as
> > such instead of just deleting them or making the indistinguishable
> > from proxy SYNC_STATE_ONLY device links.
> >
> > This way, when new nodes get added, we can immediately find and mark any
> > new cycles whether the new node is a device or firmware node.
>
> ...
>
> > + * Check if @sup_handle or any of its ancestors or suppliers direct/indirectly
> > + * depend on @con. This function can detect multiple cyles between @sup_handle
>
> A single space is enough.
>
> > + * and @con. When such dependency cycles are found, convert all device links
> > + * created solely by fw_devlink into SYNC_STATE_ONLY device links. Also, mark
>
> Ditto.
>
> > + * all fwnode links in the cycle with FWLINK_FLAG_CYCLE so that when they are
> > + * converted into a device link in the future, they are created as
> > + * SYNC_STATE_ONLY device links. This is the equivalent of doing
>
> Ditto.

Lol, you are the king of nit picks :) I don't know how you even notice
these :) I don't like the double spacing either, but as Geert pointed
out, vim inserts them when I use it to auto word-wrap comment blocks.
I'll try to address them as I find them, but I'm not going to send out
revisions of patches just for double spaces.

>
> > + * fw_devlink=permissive just between the devices in the cycle. We need to do
> > + * this because, at this point, fw_devlink can't tell which of these
> > + * dependencies is not a real dependency.
> > + *
> > + * Return true if one or more cycles were found. Otherwise, return false.
>
> Return:

I'm following the rest of the function docs in this file.

>
> (you may run `kernel-doc -v ...` to see all warnings)

Hmmm I ran it on the patch file and it didn't give me anything useful.
Running it on the whole file is just a lot of lines to dig through.

>
> ...
>
> > +static bool __fw_devlink_relax_cycles(struct device *con,
> > + struct fwnode_handle *sup_handle)
> > +{
> > + struct fwnode_link *link;
> > + struct device_link *dev_link;
>
> > + struct device *sup_dev = NULL, *par_dev = NULL;
>
> You can put it the first line since it's long enough.

Wait, is that a style guideline to have the longer lines first?

> But why do you need sup_dev assignment?

Defensive programming I suppose. I can see this function being
refactored in the future where a goto out; is inserted before sup_dev
is assigned. And then the put_device(sup_dev) at "out" will end up
operating on some junk value and causing memory corruption.

>
> > + bool ret = false;
> > +
> > + if (!sup_handle)
> > + return false;
> > +
> > + /*
> > + * We aren't trying to find all cycles. Just a cycle between con and
> > + * sup_handle.
> > + */
> > + if (sup_handle->flags & FWNODE_FLAG_VISITED)
> > + return false;
> > +
> > + sup_handle->flags |= FWNODE_FLAG_VISITED;
>
> > + sup_dev = get_dev_from_fwnode(sup_handle);
> > +
>
> I would put it closer to the condition:
>
> > + /* Termination condition. */
> > + if (sup_dev == con) {
>
> /* Get supplier device and check for termination condition */
> sup_dev = get_dev_from_fwnode(sup_handle);
> if (sup_dev == con) {

I put it the way it is because sup_dev is used for more than just
checking for termination condition.

>
> > + ret = true;
> > + goto out;
> > + }
> > +
> > + /*
> > + * If sup_dev is bound to a driver and @con hasn't started binding to
> > + * a driver, @sup_dev can't be a consumer of @con. So, no need to
>
> sup_dev or @sup_dev? What's the difference? Should you spell one of them
> in full?

Probably copy-pasta from a function doc. I'll make it sup_dev.

>
> > + * check further.
> > + */
> > + if (sup_dev && sup_dev->links.status == DL_DEV_DRIVER_BOUND &&
>
> As in the comment above, the single space is enough.
>
> > + con->links.status == DL_DEV_NO_DRIVER) {
> > + ret = false;
> > + goto out;
> > + }
> > +
> > + list_for_each_entry(link, &sup_handle->suppliers, c_hook) {
> > + if (__fw_devlink_relax_cycles(con, link->supplier)) {
> > + __fwnode_link_cycle(link);
> > + ret = true;
> > + }
> > + }
> > +
> > + /*
> > + * Give priority to device parent over fwnode parent to account for any
> > + * quirks in how fwnodes are converted to devices.
> > + */
>
> > + if (sup_dev) {
> > + par_dev = sup_dev->parent;
> > + get_device(par_dev);
> > + } else {
> > + par_dev = fwnode_get_next_parent_dev(sup_handle);
> > + }
>
> if (sup_dev)
> par_dev = get_device(sup_dev->parent);
> else
> par_dev = fwnode_get_next_parent_dev(sup_handle);

Ack, thanks. Makes it nicer.

>
> > + if (par_dev)
> > + ret |= __fw_devlink_relax_cycles(con, par_dev->fwnode);
>
> Instead I would rather do a similar pattern of the ret assignment as elsewhere
> in the function.
>
> if (par_dev && __fw_devlink_relax_cycles(con, par_dev->fwnode))
> ret = true;

Ack. Good suggestion!

>
> > + if (!sup_dev)
> > + goto out;
> > +
> > + list_for_each_entry(dev_link, &sup_dev->links.suppliers, c_node) {
> > + /*
> > + * Ignore a SYNC_STATE_ONLY flag only if it wasn't marked as a
> > + * such due to a cycle.
> > + */
> > + if (device_link_flag_is_sync_state_only(dev_link->flags) &&
> > + !(dev_link->flags & DL_FLAG_CYCLE))
> > + continue;
> > +
> > + if (__fw_devlink_relax_cycles(con,
> > + dev_link->supplier->fwnode)) {
>
> Keep it on one line.

It'll make it > 80. Is this some recent change about allowing > 80
cols? I'm leaving it as is for now.


> > + fw_devlink_relax_link(dev_link);
> > + dev_link->flags |= DL_FLAG_CYCLE;
> > + ret = true;
> > + }
> > + }
> > +
> > +out:
> > + sup_handle->flags &= ~FWNODE_FLAG_VISITED;
> > + put_device(sup_dev);
> > + put_device(par_dev);
> > + return ret;
> > +}
>
> --
> With Best Regards,
> Andy Shevchenko
>
>
> --
> To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@xxxxxxxxxxx.
>