Re: [PATCH net v2 1/2] net: dsa: lantiq_gswip: Don't use PHY auto polling

From: Vladimir Oltean
Date: Thu Apr 08 2021 - 18:46:25 EST


On Thu, Apr 08, 2021 at 08:38:27PM +0200, Martin Blumenstingl wrote:
> PHY auto polling on the GSWIP hardware can be used so link changes
> (speed, link up/down, etc.) can be detected automatically. Internally
> GSWIP reads the PHY's registers for this functionality. Based on this
> automatic detection GSWIP can also automatically re-configure its port
> settings. Unfortunately this auto polling (and configuration) mechanism
> seems to cause various issues observed by different people on different
> devices:
> - FritzBox 7360v2: the two Gbit/s ports (connected to the two internal
> PHY11G instances) are working fine but the two Fast Ethernet ports
> (using an AR8030 RMII PHY) are completely dead (neither RX nor TX
> works). It turns out that the AR8030 PHY sets the BMSR_ESTATEN bit
> as well as the ESTATUS_1000_TFULL and ESTATUS_1000_XFULL bits. This
> makes the PHY auto polling state machine (rightfully?) think that the
> established link speed (when the other side is Gbit/s capable) is
> 1Gbit/s.

Why do you say "rightfully"? The PHY is gigabit capable, and it reports
that via the Extended Status register. This is one of the reasons why
the "advertising" and "supported" link modes are separate concepts:
even though you support gigabit, you don't advertise it because you
are in RMII mode.
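
To illustrate the distinction, here is a minimal userspace sketch (not
code from lantiq_gswip or phylib). The bit values are the ones from
include/uapi/linux/mii.h; the helper names phy_supports_1000() and
resolve_speed() are made up for this example, and only full-duplex
modes are considered to keep it short. The point is that the Extended
Status register only describes what the PHY supports, while the
negotiated speed should come from what both sides advertise:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values as defined in include/uapi/linux/mii.h */
#define BMSR_ESTATEN		0x0100	/* MII_BMSR: Extended Status present */
#define ESTATUS_1000_XFULL	0x8000	/* MII_ESTATUS: can do 1000BASE-X full */
#define ESTATUS_1000_TFULL	0x2000	/* MII_ESTATUS: can do 1000BASE-T full */
#define ADVERTISE_10FULL	0x0040	/* MII_ADVERTISE / MII_LPA */
#define ADVERTISE_100FULL	0x0100	/* MII_ADVERTISE / MII_LPA */
#define ADVERTISE_1000FULL	0x0200	/* MII_CTRL1000: we advertise 1000 full */
#define LPA_1000FULL		0x0800	/* MII_STAT1000: partner advertises 1000 full */

/* Hypothetical helper: capability ("supported") as reported by the PHY. */
bool phy_supports_1000(uint16_t bmsr, uint16_t estatus)
{
	if (!(bmsr & BMSR_ESTATEN))
		return false;

	return (estatus & (ESTATUS_1000_TFULL | ESTATUS_1000_XFULL)) != 0;
}

/*
 * Hypothetical helper: highest full-duplex speed (in Mbit/s) that both
 * sides advertise, or 0 if there is no common full-duplex mode.
 * ESTATUS is deliberately not an input here.
 */
int resolve_speed(uint16_t adv, uint16_t ctrl1000,
		  uint16_t lpa, uint16_t stat1000)
{
	if ((ctrl1000 & ADVERTISE_1000FULL) && (stat1000 & LPA_1000FULL))
		return 1000;
	if (adv & lpa & ADVERTISE_100FULL)
		return 100;
	if (adv & lpa & ADVERTISE_10FULL)
		return 10;

	return 0;
}
```

With register values resembling the case described above (assumed, not
read from real hardware), phy_supports_1000(0x0100, 0xa000) is true,
yet resolve_speed(0x0100, 0x0000, 0x0100, 0x0800) yields 100 because
gigabit is not set in the local MII_CTRL1000, even though the link
partner advertises it.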

How does turning off the auto polling feature help circumvent the
Atheros PHY reporting "issue"? Do we even know that is the problem, or
is it simply a guess on your part based on something that looked strange?

> - None of the Ethernet ports on the Zyxel P-2812HNU-F1 (two are
> connected to the internal PHY11G GPHYs while the other three are
> external RGMII PHYs) are working. Neither RX nor TX traffic was
> observed. It is not clear which part of the PHY auto polling state-
> machine caused this.

Great.

> - FritzBox 7412 (only one LAN port which is connected to one of the
> internal GPHYs running in PHY22F / Fast Ethernet mode) was seeing
> random disconnects (link down events could be seen). Sometimes all
> traffic would stop after such disconnect. It is not clear which part
> of the PHY auto polling state-machine caused this.
> - TP-Link TD-W9980 (two ports are connected to the internal GPHYs
> running in PHY11G / Gbit/s mode, the other two are external RGMII
> PHYs) was affected by issues similar to those on the FritzBox 7412,
> just without the "link down" events.
>
> Switch to software based configuration instead of PHY auto polling (and
> letting the GSWIP hardware configure the ports automatically) for the
> following link parameters:
> - link up/down
> - link speed
> - full/half duplex
> - flow control (RX / TX pause)

What does the auto polling feature consist of, exactly? Is there some
sort of microcontroller accessing the MDIO bus simultaneously with
Linux?

> After a big round of manual testing by various people (who helped test
> this on OpenWrt) it turns out that this fixes all reported issues.
>
> Additionally it can be considered more future proof because any
> "quirk" which is implemented for a PHY on the driver side can now be
> used with the GSWIP hardware as well because Linux is in control of the
> link parameters.
>
> As a nice side-effect this also solves a problem where fixed-links were
> not supported previously because we were relying on the PHY auto polling
> mechanism, which cannot work for fixed-links as there's no PHY from
> where it can read the registers. Configuring the link settings on the
> GSWIP ports means that we now use the settings from device-tree also for
> ports with fixed-links.
>
> Fixes: 14fceff4771e51 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
> Fixes: 3e6fdeb28f4c33 ("net: dsa: lantiq_gswip: Let GSWIP automatically set the xMII clock")
> Cc: stable@xxxxxxxxxxxxxxx
> Acked-by: Hauke Mehrtens <hauke@xxxxxxxxxx>
> Reviewed-by: Andrew Lunn <andrew@xxxxxxx>
> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@xxxxxxxxxxxxxx>
> ---