RE: [PATCH v1] net: phy: adin: Add flags to disable enhanced link detection
From: Ken Sloat
Date: Wed Mar 01 2023 - 09:08:35 EST
Hi Andrew,
> -----Original Message-----
> From: Andrew Lunn <andrew@xxxxxxx>
> Sent: Wednesday, March 1, 2023 8:03 AM
> To: Jakub Kicinski <kuba@xxxxxxxxxx>
> Cc: Ken Sloat <ken.s@xxxxxxxxxxxxx>; Michael Hennerich
> <michael.hennerich@xxxxxxxxxx>; Heiner Kallweit
> <hkallweit1@xxxxxxxxx>; Russell King <linux@xxxxxxxxxxxxxxx>; David S.
> Miller <davem@xxxxxxxxxxxxx>; netdev@xxxxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH v1] net: phy: adin: Add flags to disable enhanced link
> detection
>
> On Tue, Feb 28, 2023 at 07:31:05PM -0800, Jakub Kicinski wrote:
> > On Tue, 28 Feb 2023 16:19:07 +0100 Andrew Lunn wrote:
> > > The Marvell PHYs also support a fast link down mode, so i think
> > > using fast link down everywhere, in the code and the commit message
> > > would be good. How about adin_fast_down_disable().
> >
> > Noob question - does this "break the IEEE standard" from the MAC<>PHY
> > perspective or the media perspective? I'm guessing it's the former and
> > the setting will depend on the MAC, given configuration via the DT?
>
> IEEE 802.3 says something like you need to wait 1 second before declaring
> the link down. For applications like MetroLAN, 1 second is too long, they
> want to know within something like 50ms so they can swap to a hot standby.
>
> Marvell PHYs have something similar, there is a register you can poke to
> shorten the time it waits until it declares the link down. I'm sure others PHYs
> have it too.
>
> Ah, we already have a PHY tunable for it, ETHTOOL_PHY_FAST_LINK_DOWN.
> I had forgotten about that. The Marvell PHY supports its.
>
> So i have two questions i guess:
>
> 1) Since it is not compliant with 802.3 by default, do we actually want it
> disabled by default? But is that going to cause regressions?
> Or there devices actually making use of this feature of this PHY?
>
I would think you have a risk here like you said of regression, perhaps some users are not even aware of this feature, but their system is somehow reliant on it.
I am not an IEEE expert, but by examining the datasheet, we can see that clearing this bit alone does not guarantee IEEE compliance. There is another related feature, which is "1000BASE-T retrain" which is disabled by default because as the datasheet explains, it should not be enabled when "Enhanced Link Detection" is enabled (default). It further explains that "Clause 40.4.6.1 of Standard 802.3 requires a PHY that is operating in 1000BASE-T mode to retrain if it detects that its receiver status is not okay." So technically, you would also need to reverse this default as well if IEEE compliance was the goal - and perhaps there are even others.
The motivating reason for this change is a customer is having broken link issues, and as the ADI datasheet suggests "Having enhanced link detection enabled is not suitable for all applications because it causes the PHY to react quickly to high levels of disturbance on the MDI lines. This configuration needs to be considered when performing conformance testing and EMC testing where the media-dependent interface can be exposed to fast transients. These transients may trigger enhanced link detection to bring the link down during such tests. In this case, it is preferred to configure all bits in the FLD_EN register to 0."
Moreover, defaulting it to opposite the HW default might be confusing for a user inspecting the datasheet. If we provide a parameter though, we allow for a reasonable way to override it if the feature causes a problem for the user.
> 2) Rather than a vendor specific DT bool to disable it, should we add a generic
> DT property listing the actual delay in milliseconds, which basically does what
> the PHY tunable does.
>
> I think the answer to the second question should be Yes. It is a bit more
> effort for this change, but is a generic solution.
>
How would the new structure look in your mind? I can foresee two obvious implementations:
1. Each PHY driver that wants to implement this feature would add a device_property_read_u32() call to read some DT param like "fast-down-threshold-ms" and then set associated registers.
2. The dt portion of #1 is done at a higher level which then calls down to the phy set_tunable with ETHTOOL_PHY_FAST_LINK_DOWN (not sure if that's proper or not). In the ADIN case, this has the added benefit of providing the ability for ethtool to set this.
I also see further complication with the ADIN PHY though. This PHY doesn't support a threshold value, but rather this feature is full on or full off. While it is not hard to just compare if set >= 750 mS and then turn feature off, as the datasheet says "more than either 350 ms or 750 ms in 1000BASE-T, depending if the PHY is 1000BASE-T master or 1000BASE-T slave." Also, I am not sure what the standard says about 100BASE-TX and if you see in my changes there are separate bit sets for each - hence the two properties.
> I was pondering the first question while reviewing and decided to say
> nothing. There is a danger of regressions. But as this case shows, it can also
> cause problems.
>
> Andrew
Perhaps I am overcomplicating this, but I am open to suggestions and will await your feedback.
Thanks!
Sincerely,
Ken Sloat