RE: [PATCH RESEND] driver core: export device_is_bound() to fix build failure

From: Aisheng Dong
Date: Wed Nov 18 2020 - 23:18:19 EST


> From: Saravana Kannan <saravanak@xxxxxxxxxx>
> Sent: Thursday, November 19, 2020 6:22 AM
>
> On Wed, Nov 18, 2020 at 7:40 AM Aisheng Dong <aisheng.dong@xxxxxxx>
> wrote:
> >
> > > From: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> > > Sent: Wednesday, November 18, 2020 6:46 PM
> > >
> > > On Wed, Nov 18, 2020 at 10:23:42AM +0000, Aisheng Dong wrote:
> > > > Hi Greg,
> > > >
> > > > > From: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> > > > > Sent: Monday, November 9, 2020 8:48 PM
> > > > >
> > > > > On Mon, Nov 09, 2020 at 12:26:55PM +0000, Aisheng Dong wrote:
> > > > > > > From: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> > > > > > > Sent: Monday, November 9, 2020 8:05 PM
> > > > > > >
> > > > > > > On Mon, Nov 09, 2020 at 11:55:46AM +0000, Aisheng Dong wrote:
> > > > > > > > > From: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> > > > > > > > > Sent: Monday, November 9, 2020 7:41 PM
> > > > > > > > >
> > > > > > > > > On Mon, Nov 09, 2020 at 10:57:05AM +0000, Aisheng Dong
> wrote:
> > > > > > > > > > Hi Greg,
> > > > > > > > > >
> > > > > > > > > > > From: Greg Kroah-Hartman
> > > > > > > > > > > <gregkh@xxxxxxxxxxxxxxxxxxx>
> > > > > > > > > > > Sent: Monday, November 9, 2020 6:37 PM
> > > > > > > > > > > Subject: Re: [PATCH RESEND] driver core: export
> > > > > > > > > > > device_is_bound() to fix build failure
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Nov 09, 2020 at 10:14:46AM +0000, Sudip
> > > > > > > > > > > Mukherjee
> > > > > wrote:
> > > > > > > > > > > > Hi Greg,
> > > > > > > > > > > >
> > > > > > > > > > > > On Sun, Nov 8, 2020 at 8:23 AM Greg Kroah-Hartman
> > > > > > > > > > > > <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Sat, Nov 07, 2020 at 10:47:27PM +0000, Sudip
> > > > > > > > > > > > > Mukherjee
> > > > > wrote:
> > > > > > > > > > > > > > When CONFIG_MXC_CLK_SCU is configured as 'm'
> > > > > > > > > > > > > > the build fails as it is unable to find
> > > > > > > > > > > > > > device_is_bound(). The error
> > > being:
> > > > > > > > > > > > > > ERROR: modpost: "device_is_bound"
> > > > > > > [drivers/clk/imx/clk-imx-scu.ko]
> > > > > > > > > > > > > > undefined!
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Export the symbol so that the module finds it.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Fixes: 77d8f3068c63 ("clk: imx: scu: add two
> > > > > > > > > > > > > > cells binding
> > > > > > > > > > > > > > support")
> > > > > > > > > > > > > > Signed-off-by: Sudip Mukherjee
> > > > > > > > > > > > > > <sudipm.mukherjee@xxxxxxxxx>
> > > > > > > > > > > > > > ---
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > resending with the Fixes: tag.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > drivers/base/dd.c | 1 +
> > > > > > > > > > > > > > 1 file changed, 1 insertion(+)
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > diff --git a/drivers/base/dd.c
> > > > > > > > > > > > > > b/drivers/base/dd.c index
> > > > > > > > > > > > > > 148e81969e04..a796a57e5efb 100644
> > > > > > > > > > > > > > --- a/drivers/base/dd.c
> > > > > > > > > > > > > > +++ b/drivers/base/dd.c
> > > > > > > > > > > > > > @@ -353,6 +353,7 @@ bool
> > > > > > > > > > > > > > device_is_bound(struct device
> > > > > > > > > > > > > > *dev)
> > > > > > > {
> > > > > > > > > > > > > > return dev->p &&
> > > > > > > > > > > > > > klist_node_attached(&dev->p->knode_driver);
> > > > > > > > > > > > > > }
> > > > > > > > > > > > > > +EXPORT_SYMBOL(device_is_bound);
> > > > > > > > > > > > >
> > > > > > > > > > > > > EXPORT_SYMBOL_GPL() please, like all the other
> > > > > > > > > > > > > exports in this
> > > > > file.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Also, wait, no, don't call this, are you sure
> > > > > > > > > > > > > you are calling it in a race-free way? And what
> > > > > > > > > > > > > branch/tree is the above
> > > > > > > commit in?
> > > > > > > > > > > >
> > > > > > > > > > > > I have not checked fully but since it is being
> > > > > > > > > > > > called from
> > > > > > > > > > > > probe() I assume the lock will be held at that time.
> > > > > > > > > > >
> > > > > > > > > > > probe() should never call this function as it makes
> > > > > > > > > > > no sense at all at that point in time. The driver should be
> fixed.
> > > > > > > > > >
> > > > > > > > > > Would you suggest if any other API we can use to allow
> > > > > > > > > > the driver to know whether another device has been probed?
> > > > > > > > >
> > > > > > > > > There is none, sorry, as that just opens up way too many
> problems.
> > > > > > > > >
> > > > > > > > > > For imx scu driver in question, it has a special
> > > > > > > > > > requirement that it depends on scu power domain driver.
> > > > > > > > > > However, there're a huge number
> > > > > > > > > > (200+) of power domains for each device clock, we
> > > > > > > > > > can't define them all in DT
> > > > > > > > > for a single clock controller node.
> > > > > > > > > >
> > > > > > > > > > That's why we wanted to use device_is_bound() before
> > > > > > > > > > to check if scu power domain is ready or not to support defer
> probe.
> > > > > > > > >
> > > > > > > > > Use the device link functionality for this type of
> > > > > > > > > thing, that is what it was created for.
> > > > > > > > >
> > > > > > > >
> > > > > > > > Thanks for the suggestion. I will check it how to use.
> > > > > > > > BTW, I wonder if dev_driver_string() could be an optional
> > > > > > > > solution which seems a more simple way?
> > > > > > >
> > > > > > > Also, how do you really know you even have a valid pointer
> > > > > > > to that other device structure? How are you getting access to that?
> > > > > > >
> > > > > >
> > > > > > The rough idea is as follows. Not sure if those APIs are safe
> > > > > > enough as there're many users In kernel.
> > > > > >
> > > > > > pd_np = of_find_compatible_node(NULL, NULL, "fsl,scu-pd");
> > > > > > pd_dev = of_find_device_by_node(pd_np); if (!pd_dev ||
> > > > > > !dev_driver_string(&pd_dev->dev) ||
> > > > > > strcmp(dev_driver_string(&pd_dev->dev), "imx-scu-pd")) {
> > > > > > of_node_put(pd_np);
> > > > > > return -EPROBE_DEFER;
> > > > > > }
> > > > >
> > > > > Ick, again, no, don't do that, you can not guarantee "names" of
> > > > > devices anywhere in the system, sorry.
> > > >
> > > > I tried to understand how to use devlink for this case by diving
> > > > deep into the devlink code, however, it looked to me there're a
> > > > few limitations on
> > > the current devlink usage.
> > >
> > > Adding Saravana, who wrote that code to help explain it.
> > >
> > > > We can't create driver presence dependency link in consumer's
> > > > probe function while the supplier driver is still not probed or even not
> created yet.
> > > > (we're the later case that supplier device scu-pd may be created
> > > > after scu-clk
> > > device).
> > >
> > > Sounds like your DT is set up backwards?
> >
> > Yes, the dts is like below:
> >
> > scu {
> > compatible = "fsl,imx-scu";
> > mbox-names = "tx0",
> > "rx0",
> > "gip3";
> > mboxes = <&lsio_mu1 0 0
> > &lsio_mu1 1 0
> > &lsio_mu1 3 3>;
> >
> > clk: clock-controller {
> > compatible = "fsl,imx8qxp-clk";
> > #clock-cells = <1>;
> > clocks = <&xtal32k &xtal24m>;
> > clock-names = "xtal_32KHz", "xtal_24Mhz";
> > };
> >
> > pd: imx8qx-pd {
> > compatible = "fsl,imx8qxp-scu-pd";
> > #power-domain-cells = <1>;
> > };
> > ...
> >
> > Clk and pd devices under scu node are propagated by scu driver in DT order.
> > If moving pd node before clk node can also address this issue.
>
> I can bet the DT maintainers are going to NACK a DT node reordering.
> The ordering of DT nodes should not matter for correctness.
>
> >
> > >
> > > > The kernel doc Documentation/driver-api/device_link.rst also
> > > > mentioned this limitation and the suggested solution is:
> > > > "The onus is thus on the consumer to check presence of the
> > > > supplier after adding the link, and defer probing on non-presence."
> > > >
> > > > Then the question is , back again, , how to check the presense of
> > > > other device driver if we can't use device_is_bound() API in module?
> > >
> > > Your driver should not care, as you can't get direct access to it,
> > > so don't try to ask the driver core about it as that is racy and not viable.
> > >
> > > If your driver needs resources that it can not get at this point in
> > > time, just return from probe with a defer error. That way it will
> > > be called again after other drivers load.
> > >
> >
> > Referring to above dts snippet, you can see the problem here is that
> > there're no power domain property in clock-controller node which cause
> > the clock driver unable to use the Normal/standard way to acquire PD
> > resources and return a EPROBE_DEFFER if failed like
>
> _THIS_ is your main problem. If you add the power domain property, your
> problem would be solved. Earlier in the thread you said:
>
> > For imx scu driver in question, it has a special requirement that it
> > depends on scu power domain driver. However, there're a huge number
> > (200+) of power domains for each device clock, we can't define them all in DT
> for a single clock controller node.
>
> But this just seems like an arbitrary position you are taking and then digging
> yourself into a hole.
>
> Does the clock controller actually depend on the power domain? If it does, then
> list it in DT. If it doesn't and the real dependency is one of the consumer devices
> that consume the clock provided by this clock controller, then your clock
> controller should not be waiting/deferring probe on this power domain. Leave
> that to the actual device that depends on the power domain.

It's not that case.
The clocks provided by clock provider depends on a big number of various power domains.

More strictly, the clock-controller node itself (clock protocol) does not depends on power domain.
But the sub clocks created by clock protocol driver depends on each resource power domain.
See: imx_clk_scu_alloc_dev()
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/clk/imx/clk-scu.c

That's why we didn't list power domains in clock-controller node.

>
> What are you even achieving by not listing the power domain in the DT node of
> the clock controller if you are then going to have the clock controller block on
> the power domain having probed?
>
> > Other resources, e.g. irq, gpio and etc.
> > That's the main reason we use device_is_bound() in clock driver to
> > check if PD driver has been probed.
> >
> > And of_genpd_add_device() we used in scu clk driver does not support
> > EPROBE_DEFER and in order to maintainer the backwards compatibility of
> > imx_clk_scu_alloc_dev() API, We also can't return PROBE_DEFER. All
> > those are special reasons which made things complicated and un-easy to
> address in standard way.
>
> This seems to be talking about some out of tree code? So can't really help you
> without seeing the code. And I don't understand what backward compatibility
> you are trying to maintain within kernel code. That's not something we try to or
> need to do. Just fix up the users.

Pls refer to above link for the code.

>
> > Maybe the simplest solution is justing move scu pd node above scu clk
> > node which is also make Sense because it can save a huge number of defer
> probes.
>
> This is 100% going to get rejected upstream. Even if you reorder the DT nodes,
> there's no guarantee that your power domain device would get probed before
> the clock driver.
>

The point here is saving a huge number of defer probes.
We could find a way to ensure power domain driver get probed before clock.

> Long story short, either
>
> * Don't care about the power domain in your clock driver.
>
> Or,
>
> * List the power domain in the clock controller's DT node and then use the
> normal APIs to get the power domain. And defer like any other driver if you
> can't get the power domain.
>

Yes, I understand those are for normal cases. But our case is a bit different and we don't want
imx_clk_scu() API return PROBE_DEFER which is unnecessary for a hundred of clocks.
Even we want to defer probe, we prefer to defer in imx_clk_scu_init() rather than in imx_clk_scu().

Maybe the things can be simplified as a simple requirement:
How users can make Driver A (CLK) to be probed after Driver B (PD) without explicit
firmware function dependency description (e.g. phandle in DT)?

As kernel core does not want to support it, then the left way may be change scu-pd driver
Inicall level or provide a private callback to query the readiness.

Regards
Aisheng

> -Saravana