Re: [PATCH net-next] net: phy: realtek: check validity of 10GbE link-partner advertisement

From: Russell King (Oracle)
Date: Tue Oct 08 2024 - 10:31:22 EST


On Tue, Oct 08, 2024 at 01:45:03PM +0100, Russell King (Oracle) wrote:
> Let's start checking what we're doing with regards to this register.
>
> 7.33.11 (Link Partner 10GBASE-T capability) states that this is only
> valid when the Page received bit (7.1.6) has been set. This is the
> BMSR_ANEGCOMPLETE / MDIO_AN_STAT1_COMPLETE bit.
>
> Looking at rtl822x_read_status, which is called directly as a
> .read_status() method, it reads some register that might be the
> equivalent of MMD 7 Register 33 (if that's what 0xa5d, 0x13 is,
> 0xa5d, 0x12 seems to be MDIO_AN_10GBT_CTRL) whether or not the link
> is up and whether or not AN has completed. It's only conditional on
> Autoneg being enabled.
>
> However, we don't look at 7.1.6, which is wrong according to 802.3.
> So I think the first thing that's needed here is that needs fixing
> - we should only be reading the LP ability registers when (a) we
> have link, and (b) when the PHY indicates that config pages have
> been received.
>
> The next thing that needs fixing is to add support for checking
> these LOCOK/REMOK bits - and if these are specific to the result of
> the negotiation (there's some hints in 802.3 that's the case, as
> there are other registers with similar bits in, but I haven't
> looked deeply at it) then, since the resolution is done in core
> PHY code, I think we need another method into drivers to check
> these bits once resolution has occurred.

Okay, I think the problem is down to the order in which Realtek is
doing stuff.

genphy_read_status() calls genphy_update_link(), which updates
phydev->link and phydev->autoneg_complete from the BMSR, and then
goes on to call genphy_read_lpa().

Looking at genphy_read_lpa():

if (phydev->autoneg == AUTONEG_ENABLE) {
if (!phydev->autoneg_complete) {
mii_stat1000_mod_linkmode_lpa_t(phydev->lp_advertising,
0);
mii_lpa_mod_linkmode_lpa_t(phydev->lp_advertising, 0);
return 0;
}

So, if BMSR_ANEGCOMPLETE is not set, then we zero the 1G FD/HD,
Autoneg, Pause, Asym Pause and 100/10M FD/HD fields in the LPA leaving
everything else alone - and then do nothing further. In other words,
we don't read the LPA registers and update these bits.

Looking at genphy_c45_read_lpa():

if (!(val & MDIO_AN_STAT1_COMPLETE)) {
linkmode_clear_bit(ETHTOOL_LINK_MODE_Autoneg_BIT,
phydev->lp_advertising);
mii_10gbt_stat_mod_linkmode_lpa_t(phydev->lp_advertising, 0);
mii_adv_mod_linkmode_adv_t(phydev->lp_advertising, 0);
phydev->pause = 0;
phydev->asym_pause = 0;

return 0;

So that's basically the same thing - if the MDIO_AN_STAT1_COMPLETE
is clear, then we clear the 10G stuff. I think that
mii_adv_mod_linkmode_adv_t() is wrong here, it should be
mii_lpa_mod_linkmode_lpa_t().

However, the principle here is that if !autoneg_complete, then the
modes that would've been set by the respective function need to be
cleared.

Now, rtl822x_read_status() reads the 10G status, modifying
phydev->lp_advertising before then going on to call
rtlgen_read_status(), which then calls genphy_read_status(), which
in turn will then call genphy_read_lpa().

First, this is the wrong way around. Realtek needs to call
genphy_read_status() so that phydev->link and phydev->autoneg_complete
are both updated to the current status.

Then, it needs to check whether AN is enabled, and whether autoneg
has completed and deal with both situations.

Afterwards, it then *possibly* needs to read its speed register and
decode that to phydev->speed, but I don't see the point of that when
it's (a) not able to also decode the duplex from that register, and
(b) when we've already resolved it ourselves from the link mode.
What I'd be worried about is if the PHY does a down-shift to a
different speed _and_ duplex from what was resolved - and thus
whether we should even be enabling downshift on this PHY. Maybe
there's a bit in 0xa43 0x12 that gives us the duplex as well?

In other words:

static int rtl822x_read_status(struct phy_device *phydev)
{
int lpadv, ret;

ret = rtlgen_read_status(phydev);
if (ret < 0)
return ret;

if (phydev->autoneg == AUTONEG_DISABLE)
return 0;

if (!phydev->autoneg_complete) {
mii_10gbt_stat_mod_linkmode_lpa_t(phydev->lp_advertising, 0);
return 0;
}

lpadv = phy_read_paged(phydev, 0xa5d, 0x13);
if (lpadv < 0)
return lpadv;

mii_10gbt_stat_mod_linkmode_lpa_t(phydev->lp_advertising, lpadv);
phy_resolve_aneg_linkmode(phydev);

return 0;
}

That should at least get proper behaviour in the link partner
advertising bitmap rather than the weirdness that Realtek is doing.
(BTW, other drivers should be audited for the same bug!)

Now, if we still have the stale 2500M problem, then I'd suggest:

if (phydev->speed >= 2500 && !(lpadv & MDIO_AN_10GBT_STAT_LOCOK)) {
/* Possible stale advertisement causing incorrect
* resolution.
*/
mii_10gbt_stat_mod_linkmode_lpa_t(phydev->lp_advertising, 0);
phy_resolve_aneg_linkmode(phydev);
}

here.

However, if we keep the rtlgen_decode_speed() stuff, and can fix the
duplex issue, then the phy_resolve_aneg_linkmode() calls should not
be necessary, and it should be moved _after_ this to ensure that
phydev->speed (and phydev->duplex) are correctly set.

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!