Re: [PATCH v1 1/2] driver core: fw_devlink: Add support for FWNODE_FLAG_BROKEN_PARENT

From: Saravana Kannan
Date: Thu Aug 26 2021 - 15:56:48 EST


Greg, Florian, Vladimir, Alvin,

Let's continue the rest of the discussion here.

On Thu, Aug 26, 2021 at 6:13 AM Andrew Lunn <andrew@xxxxxxx> wrote:
>
> On Thu, Aug 26, 2021 at 12:45:24AM -0700, Saravana Kannan wrote:
> > If a parent device is also a supplier to a child device, fw_devlink=on
> > (correctly) delays the probe() of the child device until the probe() of
> > the parent finishes successfully.
> >
> > However, some drivers of such parent devices (where parent is also a
> > supplier) incorrectly expect the child device to finish probing
> > successfully as soon as they are added using device_add() and before the
> > probe() of the parent device has completed successfully.
>
> Please can you point at the code making this assumption. It sounds
> like we are missing some EPROBE_DEFER handling in the driver, or maybe
> the DSA framework.

For context, this was discussed and explained in [1] and subsequent
replies. But let me summarize it here.

Alvin reported an issue that with fw_devlink=on, his downstream
hardware which is very similar to [2] doesn't have its PHYs probed
correctly. Instead of the PHYs being probed by the specific driver, it
gets probed by the "Generic PHY" driver. For those who aren't very
familiar with PHYs/networking (this is based on what Andrew explained
to me earlier), Ethernet PHYs follow a specific standard and can have
some extended functionality. The specific driver would give the full
functionality, but if it's not available when the PHY needs to be
used/connected, the generic PHY driver is force bound to the PHY and
it gives the basic functionality.

So upon digging into this, this is what I found and where I think we
have some bad assumptions about the driver core are present:

The DT node in [2] is probed by realtek_smi_probe() [3]. The call flow is:
realtek_smi_probe()
-> dsa_register_switch()
-> dsa_switch_probe()
-> dsa_tree_setup()
-> dsa_tree_setup_switches()
-> dsa_switch_setup()
-> ds->ops->setup(ds)
-> rtl8366rb_setup()
-> realtek_smi_setup_mdio()
-> of_mdiobus_register()
This scans the MDIO bus/DT and device_add()s the PHYs
-> dsa_port_setup()
-> dsa_port_link_register_of()
-> dsa_port_phylink_register()
-> phylink_of_phy_connect()
-> phylink_fwnode_phy_connect()
-> phy_attach_direct()
This checks if PHY device has already probed (by
checking for dev->driver). If not, it forces the
probe of the PHY using one of the generic PHY
drivers.

So within dsa_register_switch() the PHY device is added and then
expected to have probed in the same thread/calling context. As stated
earlier, this is not guaranteed by the driver core. And this is what
needs fixing. This works as long as the PHYs don't have dependencies
on any other devices/suppliers and never defer probe. In the issue
Alvin reported, the PHYs have a dependency and things fall apart. I
don't have a strong opinion on whether this is a framework level fix
or fixes in a few drivers.

In the specific instance of [2] (providing snippet below to make it
easier to follow), the "phy0" device [4] depends on the "switch"
device [2] since "switch_intc" (the interrupt provider for phy0) is
initialized by the "switch" driver. And fw_devlink=on delays the probe
of phy0 until switch[2] finishes probing successfully (i.e. after
dsa_register_switch() <- realtek_smi_probe() returns) -- this is the
whole point of fw_devlink=on this is what reduces the useless deferred
probes/probe attempts of consumers before the suppliers finish probing
successfully.

Since dsa_register_switch() assumes the PHYs would have been probed as
soon as they are added, but they aren't probed in this case, the PHY
is force bound to the generic PHY driver. Which is the original issue
Alvin reported. Hope this clears things up for everyone.

-Saravana

switch {
compatible = "realtek,rtl8366rb";
...

switch_intc: interrupt-controller {
...
};

ports {
...
port@0 {
phy-handle = <&phy0>;
};
port@1 {
};
...
};

mdio {
compatible = "realtek,smi-mdio";
...
phy0: phy@0 {
...
interrupt-parent = <&switch_intc>;
interrupts = <0>;
};
...
};
};

[1] - https://lore.kernel.org/netdev/CAGETcx_uj0V4DChME-gy5HGKTYnxLBX=TH2rag29f_p=UcG+Tg@xxxxxxxxxxxxxx/
[2] - https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/boot/dts/gemini-dlink-dir-685.dts?id=73f3af7b4611d77bdaea303fb639333eb28e37d7#n190
[3] - https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/dsa/realtek-smi-core.c?id=73f3af7b4611d77bdaea303fb639333eb28e37d7#n386
[4] - https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/boot/dts/gemini-dlink-dir-685.dts?id=73f3af7b4611d77bdaea303fb639333eb28e37d7#n255