Re: [PATCH v1 1/2] driver core: fw_devlink: Add support for FWNODE_FLAG_BROKEN_PARENT

From: Saravana Kannan
Date: Fri Aug 27 2021 - 14:06:47 EST


On Fri, Aug 27, 2021 at 6:44 AM Andrew Lunn <andrew@xxxxxxx> wrote:
>
> > fw_devlink=on/device links short circuits the probe() call of a
> > consumer (in this case the PHY) and returns -EPROBE_DEFER if the
> > supplier's (in this case switch) probe hasn't finished without an
> > error. fw_devlink/device links effectively does the probe in graph
> > topological order and there's a ton of good reasons to do it that way
> > -- what's why fw_devlink=on was implemented.
> >
> > In this specific case though, since the PHY depends on the parent
> > device, if we fail the parent's probe realtek_smi_probe() because the
> > PHYs failed to probe, we'll get into a catch-22/chicken-n-egg
> > situation and the switch/PHYs will never probe.
>
> So lets look at:
>
> arch/arm/boot/dts/vf610-zii-dev-rev-b.dts
>
> mdio-mux {
> compatible = "mdio-mux-gpio";
> pinctrl-0 = <&pinctrl_mdio_mux>;
> pinctrl-names = "default";
> gpios = <&gpio0 8 GPIO_ACTIVE_HIGH
> &gpio0 9 GPIO_ACTIVE_HIGH
> &gpio0 24 GPIO_ACTIVE_HIGH
> &gpio0 25 GPIO_ACTIVE_HIGH>;
> mdio-parent-bus = <&mdio1>;
> #address-cells = <1>;
> #size-cells = <0>;
>
>
> We have an MDIO multiplexor
>
>
> mdio_mux_1: mdio@1 {
> reg = <1>;
> #address-cells = <1>;
> #size-cells = <0>;
>
> switch0: switch@0 {
> compatible = "marvell,mv88e6085";
> pinctrl-0 = <&pinctrl_gpio_switch0>;
> pinctrl-names = "default";
> reg = <0>;
> dsa,member = <0 0>;
> interrupt-parent = <&gpio0>;
> interrupts = <27 IRQ_TYPE_LEVEL_LOW>;
>
> On the first bus, we have a Ethernet switch.
>
> interrupt-controller;
> #interrupt-cells = <2>;
> eeprom-length = <512>;
>
> ports {
> #address-cells = <1>;
> #size-cells = <0>;
>
> port@0 {
> reg = <0>;
> label = "lan0";
> phy-handle = <&switch0phy0>;
> };
>
> The first port of that switch has a pointer to a PHY.
>
> mdio {
> #address-cells = <1>;
> #size-cells = <0>;
>
> That Ethernet switch also has an MDIO bus,
>
> switch0phy0: switch0phy0@0 {
> reg = <0>;
>
> On that bus is the PHY.
>
> interrupt-parent = <&switch0>;
> interrupts = <0 IRQ_TYPE_LEVEL_HIGH>;
>
> And that PHY has an interrupt. And that interrupt is provided by the switch.
>
> Given your description, it sounds like this is also go to break.

Based on what you pasted here (I didn't look any closer), I think it
will break too.

>
> vf610-zii-dev-rev-c.dts is the same pattern, and there are more
> examples for mv88e6xxx.
>
> It is a common pattern, e.g. the mips ar9331.dtsi follows it.

Then I think this should be solved at the DSA framework level. Make a
component-master/aggregate device made up of the switches and
ports/PHYs. Then wait for all of them to not -EPROBE_DEFER and then
initialize the DSA?

> I've not yet looked at plain Ethernet drivers. This pattern could also
> exist there. And i wonder about other complex structures, i2c bus
> multiplexors, you can have interrupt controllers as i2c devices,
> etc. So the general case could exist in other places.

I haven't seen any generic issues like this reported so far. It's only
after adding phy-handle that we are hitting these issues with DSA
switches.

> I don't think we should be playing whack-a-mole by changing drivers as
> we find they regress and break. We need a generic fix. I think the
> solution is pretty clear. As you said the device depends on its
> parent. DT is a tree, so it is easy to walk up the tree to detect this
> relationship, and not fail the probe.

It's easy to do, but it is the wrong behavior for fw_devlink=on. There
are plenty of cases where it's better to delay the child device's
probe until the parent finishes. You even gave an example[7] where it
would help avoid unnecessary deferred probes. There are plenty of
other cases like this too -- there's actually a USB driver that had an
infinite deferred probe loop that fw_devlink=on fixes. Also, the whole
point of fw_devlink=on is to enforce ordering like this -- so just
blanket ignoring dependencies on parent devices doesn't make sense.

But a parent device's probe depending on a child device's probe to
succeed as soon as it's added is never right though. So I think that's
what needs to be addresses.

So we have a couple of options:
1. Use a component driver model to initialize switches. I think it
could be doable at the DSA framework level.
2. Ask fw_devlink=on to ignore it for all switch devices -- it might
be possible to move my "quick fix" to the DSA framework.
3. Remove fw_devlink support for phy-handle.

I honestly think (1) is the best option and makes sense logically too.
Not saying it's a trivial work or a one liner, but it actually makes
sense. (2) might not be possible -- I need to take a closer look. I'd
prefer not doing (3), but I'd take that over breaking the whole point
of fw_devlink=on.

-Saravana

[7] - https://lore.kernel.org/netdev/YR5eMeKzcuYtB6Tk@xxxxxxx/