RE: [PATCH] net: phy: fix PHY possibly unwork after MDIO bus resume back

From: Joakim Zhang
Date: Mon Apr 05 2021 - 21:37:35 EST



Hi Heiner,

Thanks for your comments.

> -----Original Message-----
> From: Heiner Kallweit <hkallweit1@xxxxxxxxx>
> Sent: 2021年4月4日 22:09
> To: Joakim Zhang <qiangqing.zhang@xxxxxxx>; andrew@xxxxxxx;
> linux@xxxxxxxxxxxxxxx; davem@xxxxxxxxxxxxx; kuba@xxxxxxxxxx
> Cc: netdev@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; dl-linux-imx
> <linux-imx@xxxxxxx>; christian.melki@xxxxxxxxxx
> Subject: Re: [PATCH] net: phy: fix PHY possibly unwork after MDIO bus resume
> back
>
> On 04.04.2021 12:07, Joakim Zhang wrote:
> > commit 4c0d2e96ba055 ("net: phy: consider that suspend2ram may cut off
> > PHY power") invokes phy_init_hw() when MDIO bus resume, it will soft
> > reset PHY if PHY driver implements soft_reset callback.
> > commit 764d31cacfe4 ("net: phy: micrel: set soft_reset callback to
> > genphy_soft_reset for KSZ8081") adds soft_reset for KSZ8081. After
> > these two patches, I found i.MX6UL 14x14 EVK which connected to
> > KSZ8081RNB doesn't work any more when system resume back, MAC driver
> is fec_main.c.
> >
> > It's obvious that initializing PHY hardware when MDIO bus resume back
> > would introduce some regression when PHY implements soft_reset. When I
>
> Why is this obvious? Please elaborate on why a soft reset should break
> something.

This obvious since only above two fixes work together can trigger this issue at my side.

> > am debugging, I found PHY works fine if MAC doesn't support
> > suspend/resume or phy_stop()/phy_start() doesn't been called during
> > suspend/resume. This let me realize, PHY state machine
> > phy_state_machine() could do something breaks the PHY.
> >
> > As we known, MAC resume first and then MDIO bus resume when system
> > resume back from suspend. When MAC resume, usually it will invoke
> > phy_start() where to change PHY state to PHY_UP, then trigger the
> > stat> machine to run now. In phy_state_machine(), it will start/config
> > auto-nego, then change PHY state to PHY_NOLINK, what to next is
> > periodically check PHY link status. When MDIO bus resume, it will
> > initialize PHY hardware, including soft_reset, what would soft_reset
> > affect seems various from different PHYs. For KSZ8081RNB, when it in
> > PHY_NOLINK state and then perform a soft reset, it will never complete
> auto-nego.
>
> Why? That would need to be checked in detail. Maybe chip errata
> documentation provides a hint.
>
> >
> > This patch changes PHY state to PHY_UP when MDIO bus resume back, it
> > should be reasonable after PHY hardware re-initialized. Also give
> > state machine a chance to start/config auto-nego again.
> >
>
> If the MAC driver calls phy_stop() on suspend, then phydev->suspended is true
> and mdio_bus_phy_may_suspend() returns false. As a consequence
> phydev->suspended_by_mdio_bus is false and mdio_bus_phy_resume()
> skips the PHY hw initialization.

Per my debugging, MDIO bus suspended before MAC, I think it is a common behavior,
so PHY hw initialization is always called. Does this behavior be system-dependent?

Best Regards,
Joakim Zhang
> Please also note that mdio_bus_phy_suspend() calls phy_stop_machine() that
> sets the state to PHY_UP.
>
> Having said that the current argumentation isn't convincing. I'm not aware of
> such issues on other systems, therefore it's likely that something is
> system-dependent.
>
> Please check the exact call sequence on your system, maybe it provides a hint.
>
> > Signed-off-by: Joakim Zhang <qiangqing.zhang@xxxxxxx>
> > ---
> > drivers/net/phy/phy_device.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/drivers/net/phy/phy_device.c
> > b/drivers/net/phy/phy_device.c index cc38e326405a..312a6f662481 100644
> > --- a/drivers/net/phy/phy_device.c
> > +++ b/drivers/net/phy/phy_device.c
> > @@ -306,6 +306,13 @@ static __maybe_unused int
> mdio_bus_phy_resume(struct device *dev)
> > ret = phy_resume(phydev);
> > if (ret < 0)
> > return ret;
> > +
> > + /* PHY state could be changed to PHY_NOLINK from MAC controller
> resume
> > + * rounte with phy_start(), here change to PHY_UP after re-initializing
> > + * PHY hardware, let PHY state machine to start/config auto-nego again.
> > + */
> > + phydev->state = PHY_UP;
> > +
> > no_resume:
> > if (phydev->attached_dev && phydev->adjust_link)
> > phy_start_machine(phydev);
> >