Re: [RFC PATCH 2/2] net: phy: mxl-gpy: add new device tree property to disable SGMII autoneg

From: Russell King (Oracle)
Date: Thu Apr 18 2024 - 11:02:49 EST


On Wed, Apr 17, 2024 at 09:22:50AM +0200, Stefan Eichenberger wrote:
> On Tue, Apr 16, 2024 at 07:12:49PM +0100, Russell King (Oracle) wrote:
> > On Tue, Apr 16, 2024 at 07:23:03PM +0200, Stefan Eichenberger wrote:
> > > Hi Russell and Andrew,
> > >
> > > On Tue, Apr 16, 2024 at 05:24:02PM +0100, Russell King (Oracle) wrote:
> > > > On Tue, Apr 16, 2024 at 06:02:08PM +0200, Andrew Lunn wrote:
> > > > > On Tue, Apr 16, 2024 at 05:43:16PM +0200, Stefan Eichenberger wrote:
> > > > > > Hi Andrew,
> > > > > >
> > > > > > Thanks a lot for the feedback.
> > > > > >
> > > > > > On Tue, Apr 16, 2024 at 03:46:19PM +0200, Andrew Lunn wrote:
> > > > > > > On Tue, Apr 16, 2024 at 02:10:32PM +0200, Stefan Eichenberger wrote:
> > > > > > > > Add a new device tree property to disable SGMII autonegotiation and
> > > > > > > > instead use the option to match the SGMII speed to what was negotiated
> > > > > > > > on the twisted pair interface (tpi).
> > > > > > >
> > > > > > > > Could you explain this in more detail.
> > > > > > >
> > > > > > > SGMII always runs its clocks at 1000Mbps. The MAC needs to duplicate
> > > > > > > > the symbols 100 times when running at 10Mbps, and 10 times when running
> > > > > > > at 100Mbps.
> > > > > >
> > > > > > Currently, the mxl-gpy driver uses SGMII autonegotiation for 10 Mbps,
> > > > > > 100 Mbps, and 1000 Mbps. For our Ethernet controller, which is on an
> > > > > > Octeon TX2 SoC, this means that we have to enable "in-band-status" on
> > > > > > the controller. This will work for all three speed settings. However, if
> > > > > > we have a link partner that can do 2.5 Gbps, the mxl-gpy driver will
> > > > > > disable SGMII autonegotiation in gpy_update_interface. This is not
> > > > > > supported by this Ethernet controller because in-band-status is still
> > > > > > > enabled. Therefore, we will not be able to transfer data at 2.5 Gbps,
> > > > > > > because the SGMII link will not go into a working state.
> > > > >
> > > > > > This is where I expect Russell to point out that SGMII does not support
> > > > > 2.5G. What you actually mean is that the PHY swaps to 2500BaseX. And
> > > > > 2500BaseX does not perform speed negotiation, since it only supports
> > > > > 2500. So you also need the MAC to swap to 2500BaseX.
> > > >
> > > > Yes, absolutely true that SGMII does not support 2.5G... and when
> > > > > operating faster, then 2500base-X is normally used.
> > > >
> > > > > However, 2500base-X was slow to be standardised, and consequently different
> > > > manufacturers came up with different ideas. The common theme is that
> > > > it's 1000base-X up-clocked by 2.5x. Where the ideas differ is whether
> > > > in-band negotiation is supported or not. This has been a pain point for
> > > > a while now.
> > > >
> > > > As I mentioned in my previous two messages, I have an experimental
> > > > patch series that helps to address this.
> > > >
> > > > The issue is that implementations mix manufacturers, so we need to
> > > > know the capabilities of the PHY and the capabilities of the PCS, and
> > > > then hope that we can find some common ground between their
> > > > requirements.
> > > >
> > > > There is then the issue that if you're not using phylink, then...
> > > > guess what... you either need to convert to use phylink or implement
> > > > the logic in your own MAC driver to detect what the PHY is doing
> > > > and what its capabilities are - but I think from what you've said,
> > > > you are using phylink.
> > >
> > > Thanks for the patch series and the explanation. In our use case we have
> > > the mismatch between the PHY and the mvpp2 driver in 2500BaseX mode.
> >
> > Ah, mvpp2. This is one of those cases where I think you have a
> > disagreement between manufacturers over 2500base-X.
> >
> > Marvell's documentation clearly states that when operating in 1000base-X
> > mode, AN _must_ be enabled. Since programming 2500base-X is programming
> > the mvpp2 hardware for 1000base-X and then configuring the COMPHY to
> > clock faster, AN must also be enabled when operating at 2500base-X.
> >
> > It seems you've coupled it with the mxl-gpy PHY which apparently doesn't
> > support AN when in 2500base-X.
> >
> > Welcome to the mess of 2500base-X, and sadly we finally have the
> > situation that I've feared for years: one end of a 2500base-X link
> > that's documented as requiring AN, and the other end not providing it.
> >
> > Sigh. If only the IEEE 802.3 committee had been more pro-active and
> > standardised 2500base-X _before_ manufacturers went off and did their
> > own different things.
>
> I also checked the datasheet and you are right about the 1000base-X mode
> and in-band AN. What worked for us so far was to use SGMII mode even for
> 2.5Gbps and disable in-band AN (which is possible for SGMII). I think
> this works because as you wrote, the genphy just multiplies the clock by
> 2.5 and doesn't care if it's 1000base-X or SGMII. With your patches we
> might even be able to use in-band autonegotiation for 10, 100 and 1000Mbps
> and then just disable it for 2.5Gbps. I need to test it, but I have hope
> that this should work.

There is another way we could address this. If the querying support
had a means to identify that the endpoint supports bypass mode, we
could then have phylink identify that, and arrange to program the
mvpp2 end to be in 1000base-X + x2.5 clock + AN bypass, which would
mean it wouldn't require the inband 16-bit word to be present.

I haven't fully thought it through yet - for example, I haven't
considered how we should indicate to the PCS that AN bypass mode
should be enabled or disabled via the pcs_config() method.
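
To illustrate the idea, here is a rough sketch of how such a capability
match could look. This is only an illustration under stated assumptions:
the INBAND_* flags, the inband_mode enum and pick_inband_mode() are all
hypothetical names made up for this example, and none of them are the
phylink API or part of any existing patch.

```c
#include <stdio.h>

/*
 * Hypothetical capability bits reported by each end of the link.
 * These are assumptions for illustration, not real kernel definitions.
 */
enum {
	INBAND_DISABLE = 1 << 0, /* works with in-band AN turned off */
	INBAND_ENABLE  = 1 << 1, /* works with in-band AN turned on */
	INBAND_BYPASS  = 1 << 2, /* AN on, but link still comes up if no
				  * 16-bit config word is received */
};

/* Hypothetical result of matching the two ends. */
enum inband_mode {
	MODE_NONE,      /* no common ground */
	MODE_AN_OFF,    /* both ends run without in-band AN */
	MODE_AN_ON,     /* both ends run with in-band AN */
	MODE_AN_BYPASS, /* PCS has AN enabled with bypass, PHY sends nothing */
};

/* Pick an in-band mode that both the PCS and the PHY can live with. */
static enum inband_mode pick_inband_mode(unsigned int pcs_caps,
					 unsigned int phy_caps)
{
	if ((pcs_caps & INBAND_ENABLE) && (phy_caps & INBAND_ENABLE))
		return MODE_AN_ON;

	if ((pcs_caps & INBAND_DISABLE) && (phy_caps & INBAND_DISABLE))
		return MODE_AN_OFF;

	/*
	 * The PCS insists on AN being enabled, but can bypass it when the
	 * other end never sends the config word.
	 */
	if ((pcs_caps & INBAND_BYPASS) && (phy_caps & INBAND_DISABLE))
		return MODE_AN_BYPASS;

	return MODE_NONE;
}
```

With these assumed flags, a PCS that must keep AN enabled but supports
bypass (INBAND_ENABLE | INBAND_BYPASS), paired with a PHY that provides no
in-band AN at 2500base-X (INBAND_DISABLE), resolves to MODE_AN_BYPASS.
Without the bypass bit the same pairing resolves to MODE_NONE, which is
the mismatch described above. How the chosen mode would be passed to
pcs_config() is left open here, as it is in the text.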

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!