Re: [PATCH v1 1/2] driver core: fw_devlink: Add support for FWNODE_FLAG_BROKEN_PARENT

From: Andrew Lunn
Date: Thu Sep 09 2021 - 11:00:37 EST


> Sigh... looks like some drivers register their mdio bus in their
> dsa_switch_ops->setup while others do it in their actual probe
> function (which actually makes more sense to me).

Drivers are free to do whatever they want. The driver model allows it.

> I'm okay with this big hammer for now while we figure out something
> better.


>
> >
> > diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
> > index 53f034fc2ef7..7ecd910f7fb8 100644
> > --- a/drivers/net/phy/mdio_bus.c
> > +++ b/drivers/net/phy/mdio_bus.c
> > @@ -525,6 +525,9 @@ int __mdiobus_register(struct mii_bus *bus, struct module *owner)
> > NULL == bus->read || NULL == bus->write)
> > return -EINVAL;
> >
> > + if (bus->parent && bus->parent->of_node)
> > + bus->parent->of_node->fwnode.flags |= FWNODE_FLAG_BROKEN_PARENT;
> > +
> > BUG_ON(bus->state != MDIOBUS_ALLOCATED &&
> > bus->state != MDIOBUS_UNREGISTERED);
> >
> > So basically saying all MDIO busses potentially have a problem.
> >
> > I also don't like the name FWNODE_FLAG_BROKEN_PARENT. The parents are
> > not broken, they work fine, if fw_devlink gets out of the way and
> > allows them to do their job.
>
> The parent assuming the child will be probed as soon as it's added is
> a broken expectation/assumption. fw_devlink is just catching them
> immediately.

Why is it broken? As i said in the history, DSA has worked since
2008. This behaviour is not that old, but it has been used and worked
for a number of years.

I actual think your model of the driver model is too simple and needs
to accept that a driver probe is not atomic. Resources a driver
registers with other parts of the kernel can be used before the probe
completes. And we have some corners of the kernel that depend on that.

This is particularly true for network drivers. As soon as you register
a network interface to the stack, it will start using it, before the
probe function has completed. It does not wait around for the driver
core to say it has completed probing. And i doubt this is unique to
networking. Maybe when a frame buffer driver registers a frame buffer
with the core, the core starts to draw the splash screen, before the
probe finishes? Maybe when a block driver registers a block device,
the core starts reading the partition table, before the probe
finishes? These are all examples of using a resource before the probe
completes.

> Having said that, this is not the hill either of us should choose to
> die on. So, how about something like:
> FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD
>
> If that works, I can clean up the series with this and the MDIO fix
> you mentioned.

That is O.K. for me as a fix. I can test patches next week.

Andrew