Re: [PATCH v1 0/5] Enable fw_devlink=on by default

From: Saravana Kannan
Date: Wed Feb 10 2021 - 19:03:53 EST


On Thu, Jan 28, 2021 at 7:03 AM Jon Hunter <jonathanh@xxxxxxxxxx> wrote:
>
>
> On 14/01/2021 16:56, Jon Hunter wrote:
> >
> > On 14/01/2021 16:47, Saravana Kannan wrote:
> >
> > ...
> >
> >>> Yes this is the warning shown here [0] and this is coming from
> >>> the 'Generic PHY stmmac-0:00' device.
> >>
> >> Can you print the supplier and consumer device when this warning is
> >> happening and let me know? That'd help too. I'm guessing the phy is
> >> the consumer.
> >
> >
> > Sorry I should have included that. I added a print to dump this on
> > another build but failed to include here.
> >
> > WARNING KERN Generic PHY stmmac-0:00: supplier 2200000.gpio (status 1)
> >
> > The status is the link->status and looks like the supplier is the
> > gpio controller. I have verified that the gpio controller is probed
> > before this successfully.
> >
> >> So the warning itself isn't a problem -- it's not breaking anything or
> >> leaking memory or anything like that. But the device link is jumping
> >> states in an incorrect manner. With enough context of this code (why
> >> the device_bind_driver() is being called directly instead of going
> >> through the normal probe path), it should be easy to fix (I'll just
> >> need to fix up the device link state).
> >
> > Correct, the board seems to boot fine, we just get this warning.
>
>
> Have you had chance to look at this further?

Hi Jon,

I finally got around to looking into this. Here's the email[1] that
describes why it's done this way.

[1] - https://lore.kernel.org/lkml/YCRjmpKjK0pxKTCP@xxxxxxx/

>
> The following does appear to avoid the warning, but I am not sure if
> this is the correct thing to do ...
>
> index 9179825ff646..095aba84f7c2 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -456,6 +456,10 @@ int device_bind_driver(struct device *dev)
> {
> int ret;
>
> + ret = device_links_check_suppliers(dev);
> + if (ret)
> + return ret;
> +
> ret = driver_sysfs_add(dev);
> if (!ret)
> driver_bound(dev);

So digging deeper into the usage of device_bind_driver and looking at
[1], it doesn't look like returning an error here is a good option.
When device_bind_driver() is called, the driver's probe function isn't
even called. So, there's no way for the driver to even defer probing
based on any of the suppliers. So, we have a couple of options:

1. Delete all the links to suppliers that haven't bound. We'll still
leave the links to active suppliers alone in case it helps with
suspend/resume correctness.
2. Fix the warning to not warn on suppliers that haven't probed if the
device's driver has no probe function. But this will also need fixing
up the cleanup part when device_release_driver() is called. Also, I'm
not sure if device_bind_driver() is ever called when the driver
actually has a probe() function.

Rafael,

Option 1 above is pretty straightforward.
Option 2 would look something like what's at the end of this email +
caveat about whether the probe check is sufficient.

Do you have a preference between Option 1 vs 2? Or do you have some
other option in mind?

Thanks,
Saravana

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 5481b6940a02..8102b3c48bbc 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1247,7 +1247,8 @@ void device_links_driver_bound(struct device *dev)
*/
device_link_drop_managed(link);
} else {
- WARN_ON(link->status != DL_STATE_CONSUMER_PROBE);
+ WARN_ON(link->status != DL_STATE_CONSUMER_PROBE &&
+ dev->driver->probe);
WRITE_ONCE(link->status, DL_STATE_ACTIVE);
}

@@ -1302,7 +1303,8 @@ static void __device_links_no_driver(struct device *dev)
if (link->supplier->links.status == DL_DEV_DRIVER_BOUND) {
WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
} else {
- WARN_ON(!(link->flags & DL_FLAG_SYNC_STATE_ONLY));
+ WARN_ON(!(link->flags & DL_FLAG_SYNC_STATE_ONLY) &&
+ dev->driver->probe);
WRITE_ONCE(link->status, DL_STATE_DORMANT);
}
}