Re: [net-next PATCH 3/6] net: phylink: Correctly handle PCS probe defer from PCS provider

From: Russell King (Oracle)
Date: Wed Mar 19 2025 - 12:02:21 EST


On Wed, Mar 19, 2025 at 12:58:39AM +0100, Christian Marangi wrote:
> diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
> index 7f71547e89fe..c6d9e4efed13 100644
> --- a/drivers/net/phy/phylink.c
> +++ b/drivers/net/phy/phylink.c
> @@ -1395,6 +1395,15 @@ static void phylink_major_config(struct phylink *pl, bool restart,
>  	if (pl->mac_ops->mac_select_pcs) {
>  		pcs = pl->mac_ops->mac_select_pcs(pl->config, state->interface);
>  		if (IS_ERR(pcs)) {
> +			/* PCS can be removed unexpectedly and not available
> +			 * anymore.
> +			 * PCS provider will return probe defer as the PCS
> +			 * can't be found in the global provider list.
> +			 * In such case, return -ENOENT as a more symbolic name
> +			 * for the error message.
> +			 */
> +			if (PTR_ERR(pcs) == -EPROBE_DEFER)
> +				pcs = ERR_PTR(-ENOENT);

I don't particularly like the idea of returning -EPROBE_DEFER from
mac_select_pcs()... there is no way *ever* that such an error code
could be handled.

>  	linkmode_fill(pl->supported);
>  	linkmode_copy(pl->link_config.advertising, pl->supported);
> -	phylink_validate(pl, pl->supported, &pl->link_config);
> +	ret = phylink_validate(pl, pl->supported, &pl->link_config);
> +	/* The PCS might not available at the time phylink_create
> +	 * is called. Check this and communicate to the MAC driver
> +	 * that probe should be retried later.
> +	 *
> +	 * Notice that this can only happen in probe stage and PCS
> +	 * is expected to be avaialble in phylink_major_config.
> +	 */
> +	if (ret == -EPROBE_DEFER) {
> +		kfree(pl);
> +		return ERR_PTR(ret);
> +	}

This does not solve the problem - what if the interface mode is
currently not one that requires a PCS that may not yet be probed?
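
As a rough illustration of that gap, here is a user-space sketch, not
kernel code. Every demo_* name is made up for the example, the
ERR_PTR-style helpers are simplified stand-ins, and it assumes that the
create-time check only looks at the initial interface mode. With the
PCS provider not yet probed, creation succeeds for a mode that needs no
PCS, and the deferral only shows up later when switching to a mode that
does need one:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* EPROBE_DEFER is kernel-internal, so define it for this user-space sketch. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

/* Two hypothetical modes: one needs no PCS, one needs an external PCS. */
enum demo_interface { DEMO_IF_RGMII, DEMO_IF_SGMII };

struct demo_pcs { int id; };

static struct demo_pcs demo_pcs_instance;
static int demo_pcs_probed;	/* 0: the PCS provider has not probed yet */

/* Simplified stand-ins for the kernel's ERR_PTR()/PTR_ERR()/IS_ERR(). */
static void *demo_err_ptr(intptr_t err) { return (void *)err; }
static intptr_t demo_ptr_err(const void *p) { return (intptr_t)p; }
static int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-4095;
}

/* Hypothetical mac_select_pcs()-like callback backed by a provider lookup. */
static struct demo_pcs *demo_select_pcs(enum demo_interface iface)
{
	if (iface == DEMO_IF_RGMII)
		return NULL;	/* this mode does not use a PCS */
	if (!demo_pcs_probed)
		return demo_err_ptr(-EPROBE_DEFER);
	return &demo_pcs_instance;
}

/* Validate a single interface mode; returns 0 or a negative errno. */
static int demo_validate(enum demo_interface iface)
{
	struct demo_pcs *pcs = demo_select_pcs(iface);

	return demo_is_err(pcs) ? (int)demo_ptr_err(pcs) : 0;
}

/* Create-time check: only the initial interface mode is validated. */
static int demo_create(enum demo_interface initial)
{
	return demo_validate(initial);
}

/* Runtime reconfiguration to another mode, long after probe has finished. */
static int demo_major_config(enum demo_interface iface)
{
	return demo_validate(iface);
}
```

In this model, demo_create(DEMO_IF_RGMII) returns 0 while the provider
is still missing, and a later demo_major_config(DEMO_IF_SGMII) returns
-EPROBE_DEFER at a point where there is no probe left to defer.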

I don't like the idea that mac_select_pcs() might be doing a complex
lookup - that could make scanning the interface modes (as
phylink_validate_mask() does) quite slow and unreliable, and phylink
currently assumes that a PCS that is validated as present will remain
present.
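
To make the cost concern concrete, a small user-space sketch (again not
kernel code; the scan_* names and the eight-mode limit are invented for
the example). It assumes a mask scan that calls the select callback
once per candidate interface mode, so a callback that performs a
provider lookup pays for that lookup on every mode scanned:

```c
#include <assert.h>
#include <stddef.h>

#define SCAN_NUM_INTERFACES 8	/* arbitrary size for the sketch */

struct scan_pcs { int id; };

static struct scan_pcs scan_pcs_instance;
static unsigned int scan_lookup_calls;	/* counts simulated provider lookups */

/* Hypothetical select callback; each call stands for one provider lookup. */
static struct scan_pcs *scan_select_pcs(unsigned int iface)
{
	scan_lookup_calls++;
	/* Pretend that odd-numbered modes use a PCS and even ones do not. */
	return (iface % 2) ? &scan_pcs_instance : NULL;
}

/* Scan every candidate mode in the mask, one select call per mode. */
static unsigned long scan_validate_mask(unsigned long candidates)
{
	unsigned long ok = 0;

	for (unsigned int i = 0; i < SCAN_NUM_INTERFACES; i++) {
		if (!(candidates & (1UL << i)))
			continue;
		(void)scan_select_pcs(i);	/* result unused in this sketch */
		ok |= 1UL << i;
	}
	return ok;
}
```

Scanning a mask with four candidate modes performs four lookups here.
Nothing in the scan re-checks later that a PCS returned at this point
is still present.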

If it goes away by the time phylink_major_config() is called, then we
leave the phylink state no longer reflecting how the hardware is
programmed, but we still continue to call mac_link_up() - which should
probably be fixed.
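
A simplified user-space model of that mismatch (not kernel code; the
mock_* names are invented, and the ordering of the steps is an
assumption made for illustration). The reconfiguration fails because
the PCS is gone, the software state is updated anyway, and the link-up
step still runs. The "guard" parameter and the config_failed flag show
one hypothetical shape a fix could take, not an existing phylink
mechanism:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct mock_link {
	int sw_interface;	/* what the software state claims */
	int hw_interface;	/* what the hardware is programmed for */
	bool pcs_present;	/* false once the PCS has gone away */
	bool config_failed;	/* hypothetical flag, not a real phylink field */
	int link_up_calls;	/* counts simulated mac_link_up() calls */
};

/* Reprogram the hardware for iface; fails if the PCS has disappeared. */
static int mock_major_config(struct mock_link *l, int iface)
{
	if (!l->pcs_present)
		return -ENOENT;
	l->hw_interface = iface;
	return 0;
}

/* Simplified resolve step; guard enables the hypothetical link-up check. */
static void mock_resolve(struct mock_link *l, int new_iface, bool guard)
{
	if (l->sw_interface != new_iface) {
		l->config_failed = mock_major_config(l, new_iface) != 0;
		/* Software state moves on even if the hardware did not. */
		l->sw_interface = new_iface;
	}

	if (guard && l->config_failed)
		return;		/* keep the link down after a failed config */

	l->link_up_calls++;	/* stands in for mac_link_up() */
}
```

Without the guard, the model ends with sw_interface and hw_interface
disagreeing and link_up_calls incremented; with the guard, the link-up
step is skipped after the failed reconfiguration.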

Given that netdev is severely backlogged, I'm not inclined to add to
the netdev maintainers' workloads by trying to fix this until after
the merge window - it looks like they're at least one week behind.
Consequently, I'm expecting that most patches that have been
submitted during this week will be dropped from patchwork, which
means submitting patches this week is likely not useful.

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!