Re: [PATCH net] net: phy: realtek: clear status if link is down

From: Russell King (Oracle)
Date: Wed Jan 15 2025 - 04:25:27 EST


On Wed, Jan 15, 2025 at 05:07:22AM +0000, Daniel Golle wrote:
> Hi Andrew,
>
> On Wed, Jan 15, 2025 at 03:50:33AM +0100, Andrew Lunn wrote:
> > On Wed, Jan 15, 2025 at 12:46:11AM +0000, Daniel Golle wrote:
> > > Clear speed, duplex and master/slave status in case the link is down
> > > to avoid reporting bogus link(-partner) properties.
> > >
> > > Fixes: 5cb409b3960e ("net: phy: realtek: clear 1000Base-T link partner advertisement")
> > > Signed-off-by: Daniel Golle <daniel@xxxxxxxxxxxxxx>
> > > ---
> > > drivers/net/phy/realtek.c | 20 ++++++++++++++------
> > > 1 file changed, 14 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/net/phy/realtek.c b/drivers/net/phy/realtek.c
> > > index f65d7f1f348e..3f0e03e2abce 100644
> > > --- a/drivers/net/phy/realtek.c
> > > +++ b/drivers/net/phy/realtek.c
> > > @@ -720,8 +720,12 @@ static int rtlgen_read_status(struct phy_device *phydev)
> > >  	if (ret < 0)
> > >  		return ret;
> > >
> > > -	if (!phydev->link)
> > > +	if (!phydev->link) {
> > > +		phydev->duplex = DUPLEX_UNKNOWN;
> > > +		phydev->master_slave_state = MASTER_SLAVE_STATE_UNKNOWN;
> > > +		phydev->speed = SPEED_UNKNOWN;
> > >  		return 0;
> > > +	}
> > >
> >
> > I must be missing something here...
> >
> >
> > rtlgen_read_status() first calls genphy_read_status(phydev);
> > [...]
> > Why is that not sufficient ?
>
> The problem is the stale NBase-T link-partner advertisement bits and the
> subsequent call to phy_resolve_aneg_linkmode(), which results in bogus
> speed and duplex, based on previously connected link partner advertising
> 2500Base-T, 5GBase-T or 10GBase-T modes.

This means you're also populating the link-partner advertisement bits
with bogus data. It would be better not to read the link status.

Does it leave the MDIO_AN_STAT1_COMPLETE bit set as well, thus causing
genphy_c45_baset1_read_lpa() to read the advertisement rather than
clearing it?

Or is it because this is buggy:

	/* Vendor register as C45 has no standardized support for 1000BaseT */
	if (phydev->autoneg == AUTONEG_ENABLE) {
		val = phy_read_mmd(phydev, MDIO_MMD_VEND2,
				   RTL822X_VND2_GANLPAR);
		if (val < 0)
			return val;

		mii_stat1000_mod_linkmode_lpa_t(phydev->lp_advertising, val);
	}

This should be clearing the bits that mii_stat1000_mod_linkmode_lpa_t()
sets if autoneg is not complete.
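
As a rough illustration of that, here is a minimal standalone sketch
(plain C, not kernel code) of a vendor-register read path that always
writes the 1000baseT link-partner bits, using 0 when autoneg has not
completed. struct phy_model, mod_1000bt_lpa() and read_vendor_lpa() are
hypothetical stand-ins invented for this sketch; only the LPA_1000HALF
and LPA_1000FULL values come from include/uapi/linux/mii.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values as defined in include/uapi/linux/mii.h */
#define LPA_1000HALF 0x0400
#define LPA_1000FULL 0x0800

/* Hypothetical stand-ins for two bits of phydev->lp_advertising */
#define LP_1000BASET_HALF (1u << 0)
#define LP_1000BASET_FULL (1u << 1)

/* Hypothetical, heavily reduced model of the PHY state */
struct phy_model {
	bool autoneg_enabled;
	bool autoneg_complete;
	uint16_t vendor_lpa_reg;	/* what the vendor register would return */
	uint32_t lp_advertising;	/* link-partner modes the driver reports */
};

/* Set or clear both 1000baseT bits according to val, like a "mod" helper */
static void mod_1000bt_lpa(uint32_t *lp_adv, uint16_t val)
{
	*lp_adv &= ~(LP_1000BASET_HALF | LP_1000BASET_FULL);
	if (val & LPA_1000HALF)
		*lp_adv |= LP_1000BASET_HALF;
	if (val & LPA_1000FULL)
		*lp_adv |= LP_1000BASET_FULL;
}

/* Always update the bits: from the register if autoneg completed, else 0 */
static void read_vendor_lpa(struct phy_model *phy)
{
	uint16_t val = 0;

	assert(phy);
	if (phy->autoneg_enabled && phy->autoneg_complete)
		val = phy->vendor_lpa_reg;

	mod_1000bt_lpa(&phy->lp_advertising, val);
}
```

The point of the shape is that the "mod" call is unconditional, so a
previous partner's 1000baseT bits cannot survive a pass where autoneg is
incomplete. Whether the real driver should key off autoneg completion in
exactly this way is an assumption of the sketch.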

> rtl822x_c45_read_status() calls genphy_c45_read_status(), which calls
> genphy_c45_read_lpa(), and that clears neither
> ETHTOOL_LINK_MODE_1000baseT_Half_BIT nor ETHTOOL_LINK_MODE_1000baseT_Full_BIT
> as there is no generic handling for 1000Base-T in Clause-45.
>
> So also in the Clause-45 case, the subsequent call to
> phy_resolve_aneg_linkmode() may then wrongly populate speed and duplex, this
> time according to the stale 1000baseT bits.
>
> Moving the call to genphy_c45_read_status() in rtl822x_c45_read_status() to
> after the 1000baseT lpa bits have been taken care of fixes that part of the
> issue.
>
> Clearing master_slave_state in the C45 case is still necessary because it isn't
> done by genphy_c45_read_status().

So that's a yes then.

That's because the functions only set/clear the advertisement bits that
they control. If the PHY driver manages any other advertisement bits, it
has to _fully_ manage them. So with 1000baseT bits, as the generic
functions don't manage them, the PHY driver has to do _full_ management
of them. That includes clearing the bits when autoneg is not complete.
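
A toy model of why partial management goes wrong, as a sketch only: the
generic reader below rewrites just the bits in its own mask, and the
resolver picks the fastest mode common to both sides. All names and bit
assignments here are hypothetical; this does not reproduce
phy_resolve_aneg_linkmode() or any genphy_c45_*() function.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mode bits, ordered slowest to fastest */
#define MODE_100_FULL	(1u << 0)
#define MODE_1000_FULL	(1u << 1)	/* driver-managed in this model */
#define MODE_2500_FULL	(1u << 2)

/* Bits the hypothetical generic reader owns; 1000baseT is not among them */
#define GENERIC_MASK	(MODE_100_FULL | MODE_2500_FULL)

/* Generic reader: rewrites only the bits it owns, leaves the rest alone */
static void generic_read_lpa(uint32_t *lp_adv, uint32_t reported)
{
	assert(lp_adv);
	*lp_adv = (*lp_adv & ~GENERIC_MASK) | (reported & GENERIC_MASK);
}

/* Driver part: writes its bit on every call, including "cleared" */
static void driver_read_lpa(uint32_t *lp_adv, int aneg_complete,
			    uint32_t reported)
{
	assert(lp_adv);
	*lp_adv &= ~MODE_1000_FULL;
	if (aneg_complete)
		*lp_adv |= reported & MODE_1000_FULL;
}

/* Resolve speed in Mbit/s from the fastest common mode, 0 if none */
static int resolve_speed(uint32_t adv, uint32_t lp_adv)
{
	uint32_t common = adv & lp_adv;

	if (common & MODE_2500_FULL)
		return 2500;
	if (common & MODE_1000_FULL)
		return 1000;
	if (common & MODE_100_FULL)
		return 100;
	return 0;
}
```

In this model, calling only generic_read_lpa() after the link drops
leaves MODE_1000_FULL set, and resolve_speed() still returns 1000. Once
driver_read_lpa() is also called with aneg_complete == 0, the stale bit
is gone and nothing is resolved.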

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!