Re: [PATCH v1 1/2] driver core: fw_devlink: Add support for FWNODE_FLAG_BROKEN_PARENT

From: Saravana Kannan
Date: Wed Sep 08 2021 - 23:23:50 EST


On Wed, Sep 8, 2021 at 6:39 PM Andrew Lunn <andrew@xxxxxxx> wrote:
>
> > --- a/net/dsa/dsa2.c
> > +++ b/net/dsa/dsa2.c
> > @@ -1286,6 +1286,17 @@ static int dsa_switch_parse_of(struct
> > dsa_switch *ds, struct device_node *dn)
> > {
> > int err;
> >
> > + /* A lot of switch devices have their PHYs as child devices and have
> > + * the PHYs depend on the switch as a supplier (Eg: interrupt
> > + * controller). With fw_devlink=on, that means the PHYs will defer
> > + * probe until the probe() of the switch completes. However, the way
> > + * the DSA framework is designed, the PHYs are expected to be probed
> > + * successfully before the probe() of the switch completes.
> > + *
> > + * So, mark the switch devices as a "broken parent" so that fw_devlink
> > + * knows not to create device links between PHYs and the parent switch.
> > + */
> > + np->fwnode.flags |= FWNODE_FLAG_BROKEN_PARENT;
> > err = dsa_switch_parse_member_of(ds, dn);
> > if (err)
> > return err;
>
> This does not work. First off, it's dn, not np.

My bad. Copy-paste error.

> But with that fixed, it
> still does not work. This is too late, the mdio busses have already
> been registered and probed, the PHYs have been found on the busses,
> and the PHYs would have been probed, if not for fw_devlink.

Sigh... looks like some drivers register their mdio bus in their
dsa_switch_ops->setup while others do it in their actual probe
function (which actually makes more sense to me).

>
> What did work was:
>
> diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
> index c45ca2473743..45d67d50e35f 100644
> --- a/drivers/net/dsa/mv88e6xxx/chip.c
> +++ b/drivers/net/dsa/mv88e6xxx/chip.c
> @@ -6249,8 +6249,10 @@ static int mv88e6xxx_probe(struct mdio_device *mdiodev)
> if (!np && !pdata)
> return -EINVAL;
>
> - if (np)
> + if (np) {
> compat_info = of_device_get_match_data(dev);
> + np->fwnode.flags |= FWNODE_FLAG_BROKEN_PARENT;
> + }
>
> if (pdata) {
> compat_info = pdata_device_get_match_data(dev);
>
> This will fix it for mv88e6xxx. But if the same problem occurs in any
> of the other DSA drivers, they will still be broken:
>
> ~/linux/drivers/net/dsa$ grep -r mdiobus_register *
> bcm_sf2.c: err = mdiobus_register(priv->slave_mii_bus);
> dsa_loop_bdinfo.c: return mdiobus_register_board_info(&bdinfo, 1);
> lantiq_gswip.c: return of_mdiobus_register(ds->slave_mii_bus, mdio_np);
> mt7530.c: ret = mdiobus_register(bus);
> mv88e6xxx/chip.c: err = of_mdiobus_register(bus, np);
> grep: mv88e6xxx/chip.o: binary file matches
> ocelot/seville_vsc9953.c: rc = mdiobus_register(bus);
> ocelot/felix_vsc9959.c: rc = mdiobus_register(bus);
> qca/ar9331.c: ret = of_mdiobus_register(mbus, mnp);
> qca8k.c: return devm_of_mdiobus_register(priv->dev, bus, mdio);
> realtek-smi-core.c: ret = of_mdiobus_register(smi->slave_mii_bus, mdio_np);

This one would have worked because it registers it in the ->setup()
ops. So it's not a simple grep for of_mdiobus_register(). But your
point stands nonetheless.

> sja1105/sja1105_mdio.c: rc = of_mdiobus_register(bus, np);
> sja1105/sja1105_mdio.c: rc = of_mdiobus_register(bus, np);
> sja1105/sja1105_mdio.c: rc = mdiobus_register(bus);
> sja1105/sja1105_mdio.c:int sja1105_mdiobus_register(struct dsa_switch *ds)
> sja1105/sja1105.h:int sja1105_mdiobus_register(struct dsa_switch *ds);
> sja1105/sja1105_main.c: rc = sja1105_mdiobus_register(ds);
>
> If you are happy to use a big hammer:

I'm okay with this big hammer for now while we figure out something better.

>
> diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
> index 53f034fc2ef7..7ecd910f7fb8 100644
> --- a/drivers/net/phy/mdio_bus.c
> +++ b/drivers/net/phy/mdio_bus.c
> @@ -525,6 +525,9 @@ int __mdiobus_register(struct mii_bus *bus, struct module *owner)
> NULL == bus->read || NULL == bus->write)
> return -EINVAL;
>
> + if (bus->parent && bus->parent->of_node)
> + bus->parent->of_node->fwnode.flags |= FWNODE_FLAG_BROKEN_PARENT;
> +
> BUG_ON(bus->state != MDIOBUS_ALLOCATED &&
> bus->state != MDIOBUS_UNREGISTERED);
>
> So basically saying all MDIO busses potentially have a problem.
>
> I also don't like the name FWNODE_FLAG_BROKEN_PARENT. The parents are
> not broken, they work fine, if fw_devlink gets out of the way and
> allows them to do their job.

The parent assuming the child will be probed as soon as it's added is
a broken expectation/assumption. fw_devlink is just catching them
immediately.

Having said that, this is not the hill either of us should choose to
die on. So, how about something like:
FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD

If that works, I can clean up the series with this and the MDIO fix
you mentioned.
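
To make the combination concrete, here is a rough userspace mock of the renamed flag together with the MDIO bus check quoted above. This is only a sketch and not the actual patch: the structs are minimal stand-ins for the kernel ones, the bit value is a placeholder, and the helpers mdiobus_flag_parent() and child_waits_for_parent() are hypothetical names used purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bit; the real value would be assigned in the kernel headers. */
#define FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD (1u << 3)

/* Minimal stand-ins for the kernel structures; only the fields used here. */
struct fwnode_handle { unsigned int flags; };
struct device_node { struct fwnode_handle fwnode; };
struct device { struct device_node *of_node; };
struct mii_bus { struct device *parent; };

/* Mirrors the check proposed for __mdiobus_register(): flag the bus parent. */
void mdiobus_flag_parent(struct mii_bus *bus)
{
	if (bus->parent && bus->parent->of_node)
		bus->parent->of_node->fwnode.flags |=
			FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD;
}

/*
 * Hypothetical helper: returns true if a child would be made to wait for
 * this parent, i.e. the parent has NOT asked for children to be bound
 * as soon as they are added.
 */
bool child_waits_for_parent(const struct fwnode_handle *parent)
{
	return !(parent->flags & FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD);
}
```

With this, a switch node that registers an MDIO bus ends up flagged, so the PHYs under it would no longer be held back waiting for the switch to finish probing. A bus with no parent or no of_node is left alone.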

> You also asked about why the component framework is not used. DSA has
> been around for a while, the first commit dates back to October
> 2008. Russell King's first commit for the component framework is
> January 2014. The plain driver model has worked for the last 13 years,
> so there has not been any need to change.

Thanks for the history on why it couldn't have been used earlier.

In the long run, I'd still like to fix this so that
dsa_tree_setup() doesn't need the flag above. I have some ideas using
device links that'll be much simpler to understand and maintain than
using the component framework. I'll send out patches for that later
(not meant for 5.15), and we can go with the MDIO bus hammer for 5.15.

-Saravana