Re: [PATCH net] net: dsa: skip user_mii_bus fallback if phy-handle is set
From: Vladimir Oltean
Date: Mon Mar 09 2026 - 20:01:21 EST
On Fri, Mar 06, 2026 at 03:16:32AM +0000, Daniel Golle wrote:
> When phylink_of_phy_connect() returns -ENODEV, dsa_user_phy_setup()
> falls back to connecting the port to the PHY at dp->index on the
> switch's internal MDIO bus. This fallback was introduced to handle
> switches where no explicit phy-handle is given in DT. However, if a
> phy-handle property is present but the referenced PHY device is not
> yet available at registration time, phylink_of_phy_connect() also
> returns -ENODEV, causing the fallback to potentially attach the wrong
> PHY device instead of propagating the error.
>
> This becomes a very weird bug on switches on which the PHY address
> isn't equal to the corresponding DSA port's index, as failure to
> attach the PHY with -ENOENT then just attaches another PHY, typically
> rendering two ports unusable instead of just one (and until you read
> and understand the code it looks like an alarming memory corruption
> rather than just PHY not being ready on time).
I think you've just discovered an undocumented requirement than actually
fixing a bug with this patch. If anything, the bug is in mxl862xx.
When making the comments that led to commit aefa52a28a36 ("net: dsa:
mxl862xx: rename MDIO op arguments"), I missed the fact that
mxl862xx_setup_mdio() is also setting ds->user_mii_bus.
Please remove that.
The only case when you can share the same function pointers between a
normal MDIO bus ops and ds->user_mii_bus ops is when the MDIO addresses
are equal to the port. Because otherwise, you need to translate the
"port" given to the ds->user_mii_bus into an MDIO address. If you don't,
you'll always connect to the wrong port using the ds->user_mii_bus, if
the internal PHY is not described in OF.
> Fix this by calling fwnode_get_phy_node() before falling back:
> If a phy-handle fwnode exists, skip the internal bus fallback and let
> the -ENODEV propagate to the caller. The fallback is only taken when
> no phy-handle is present in DT, which was the original intent.
>
> Fixes: aab9c4067d238 ("net: dsa: Plug in PHYLINK support")
> Signed-off-by: Daniel Golle <daniel@xxxxxxxxxxxxxx>
> ---
> net/dsa/user.c | 14 ++++++++++++--
> 1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/net/dsa/user.c b/net/dsa/user.c
> index c4bd6fe90b455..90e540f490bb3 100644
> --- a/net/dsa/user.c
> +++ b/net/dsa/user.c
> @@ -2656,6 +2656,7 @@ static int dsa_user_phy_setup(struct net_device *user_dev)
> {
> struct dsa_port *dp = dsa_user_to_port(user_dev);
> struct device_node *port_dn = dp->dn;
> + struct fwnode_handle *phy_fwnode;
> struct dsa_switch *ds = dp->ds;
> u32 phy_flags = 0;
> int ret;
> @@ -2682,9 +2683,18 @@ static int dsa_user_phy_setup(struct net_device *user_dev)
> ret = phylink_of_phy_connect(dp->pl, port_dn, phy_flags);
> if (ret == -ENODEV && ds->user_mii_bus) {
> /* We could not connect to a designated PHY or SFP, so try to
> - * use the switch internal MDIO bus instead
> + * use the switch internal MDIO bus instead. Only fall back if
> + * no phy-handle was specified in DT. If a phy-handle exists
> + * but the PHY device is missing (e.g. not yet ready at
> + * registration time), connecting to a PHY at dp->index would
> + * attach the wrong PHY device.
> */
> - ret = dsa_user_phy_connect(user_dev, dp->index, phy_flags);
> + phy_fwnode = fwnode_get_phy_node(of_fwnode_handle(port_dn));
> + if (IS_ERR(phy_fwnode))
> + ret = dsa_user_phy_connect(user_dev, dp->index,
> + phy_flags);
> + else
> + fwnode_handle_put(phy_fwnode);
Yeah, phylib seems incapable of deferring probing of a MAC driver that
connects to the PHY at probe time and doesn't find it. This is because
the code is shared with drivers that connect to the PHY at ndo_open()
time, and returning -EPROBE_DEFER to them wouldn't make any sense. So
phylib doesn't return -EPROBE_DEFER to anyone.
Actually this is one of the reasons why I was looking to upstream a
change that moves the DSA PHY connection to ndo_open(). It allows the
PHY probing to settle. I would appreciate not complicating this logic if
we don't have to.
> }
> if (ret) {
> netdev_err(user_dev, "failed to connect to PHY: %pe\n",
> --
> 2.53.0