Re: [PATCH] net: phy: fix PHY possibly unwork after MDIO bus resume back

From: Heiner Kallweit
Date: Sun Apr 04 2021 - 10:10:13 EST


On 04.04.2021 12:07, Joakim Zhang wrote:
> commit 4c0d2e96ba055 ("net: phy: consider that suspend2ram may cut
> off PHY power") invokes phy_init_hw() when MDIO bus resume, it will
> soft reset PHY if PHY driver implements soft_reset callback.
> commit 764d31cacfe4 ("net: phy: micrel: set soft_reset callback to
> genphy_soft_reset for KSZ8081") adds soft_reset for KSZ8081. After these
> two patches, I found i.MX6UL 14x14 EVK which connected to KSZ8081RNB doesn't
> work any more when system resume back, MAC driver is fec_main.c.
>
> It's obvious that initializing PHY hardware when MDIO bus resume back
> would introduce some regression when PHY implements soft_reset. When I

Why is this obvious? Please elaborate on why a soft reset should break
something.

> am debugging, I found PHY works fine if MAC doesn't support suspend/resume
> or phy_stop()/phy_start() doesn't been called during suspend/resume. This
> let me realize, PHY state machine phy_state_machine() could do something
> breaks the PHY.
>
> As we known, MAC resume first and then MDIO bus resume when system
> resume back from suspend. When MAC resume, usually it will invoke
> phy_start() where to change PHY state to PHY_UP, then trigger the stat> machine to run now. In phy_state_machine(), it will start/config
> auto-nego, then change PHY state to PHY_NOLINK, what to next is
> periodically check PHY link status. When MDIO bus resume, it will
> initialize PHY hardware, including soft_reset, what would soft_reset
> affect seems various from different PHYs. For KSZ8081RNB, when it in
> PHY_NOLINK state and then perform a soft reset, it will never complete
> auto-nego.

Why? That would need to be checked in detail. Maybe chip errata
documentation provides a hint.

>
> This patch changes PHY state to PHY_UP when MDIO bus resume back, it
> should be reasonable after PHY hardware re-initialized. Also give state
> machine a chance to start/config auto-nego again.
>

If the MAC driver calls phy_stop() on suspend, then phydev->suspended
is true and mdio_bus_phy_may_suspend() returns false. As a consequence
phydev->suspended_by_mdio_bus is false and mdio_bus_phy_resume()
skips the PHY hw initialization.
Please also note that mdio_bus_phy_suspend() calls phy_stop_machine()
that sets the state to PHY_UP.

Having said that the current argumentation isn't convincing. I'm not
aware of such issues on other systems, therefore it's likely that
something is system-dependent.

Please check the exact call sequence on your system, maybe it
provides a hint.

> Signed-off-by: Joakim Zhang <qiangqing.zhang@xxxxxxx>
> ---
> drivers/net/phy/phy_device.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index cc38e326405a..312a6f662481 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -306,6 +306,13 @@ static __maybe_unused int mdio_bus_phy_resume(struct device *dev)
> ret = phy_resume(phydev);
> if (ret < 0)
> return ret;
> +
> + /* PHY state could be changed to PHY_NOLINK from MAC controller resume
> + * rounte with phy_start(), here change to PHY_UP after re-initializing
> + * PHY hardware, let PHY state machine to start/config auto-nego again.
> + */
> + phydev->state = PHY_UP;
> +
> no_resume:
> if (phydev->attached_dev && phydev->adjust_link)
> phy_start_machine(phydev);
>