[REGRESSION PATCH RFC] net: phy: don't resume PHY via MDIO when iface is not up
From: Wolfram Sang
Date: Thu Feb 23 2023 - 02:05:38 EST
TL;DR: Commit 96fb2077a517 ("net: phy: consider that suspend2ram may cut
off PHY power") caused regressions for us when resuming an interface
which is not up. It turns out the actual problem lies elsewhere; the
above commit only makes it visible. The attached patch is probably not
the right fix, but it at least confirms my assumptions AFAICS.
Setup: I used Renesas boards for my tests, namely Salvator-XS and Ebisu.
They both use RAVB driver (drivers/net/ethernet/renesas/ravb_main.c) and
a Micrel KSZ9031 PHY (drivers/net/phy/micrel.c). I think the problems
are generic, though.
Long text: After the above commit, we could see various resume failures
on our boards, like timeouts when resetting the MDIO bus, or warnings
about skew values in non-RGMII mode, although RGMII was used. All of
these happened because phy_init_hw() was now called in
mdio_bus_phy_resume(), which wasn't the case before. But the interface
was not up yet, e.g. phydev->interface was still the default and not
RGMII, so the initialization didn't work properly. phy_attach_direct()
pays attention to this:
	/* Do initial configuration here, now that
	 * we have certain key parameters
	 * (dev_flags and interface)
	 */
	err = phy_init_hw(phydev);
But phy_init_hw() itself doesn't take this into account when the
interface is not up, AFAICS.
This may be a problem in itself, but I then wondered why
mdio_bus_phy_resume() gets called at all: the RAVB driver sets
'phydev->mac_managed_pm = true', so once the interface is up,
mdio_bus_phy_resume() never gets called. But again, the interface was
not up yet, so mac_managed_pm was not set yet.
So, in my quest to avoid mdio_bus_phy_resume() being called, I tried
this patch, which declares the PHY to be in suspended state when it is
probed. The KSZ9031 has a soft_reset() callback, so phy_init_hw() will
reset the suspended flag when the PHY is attached. It works for me(tm):
suspend/resume now works independently of the interface being up or not.
I don't think this is the proper solution, though. It will e.g. fail if
some PHY is not using the soft_reset() callback. And I lack the
experience in this subsystem to decide if we can clear the suspended
flag in phy_init_hw() unconditionally. My gut feeling is that we can't.
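To make the reasoning above easier to follow, here is a small user-space
sketch of the flag handling as I understand it. It is not kernel code: the
toy_* names and struct toy_phy are made up for illustration, and the model
assumes that the MDIO bus resume path leaves the PHY alone whenever
'suspended' or 'mac_managed_pm' is set, which simplifies what the real
code does.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few struct phy_device fields discussed. */
struct toy_phy {
	bool mac_managed_pm;	/* set by the MAC driver once the iface is up */
	bool suspended;		/* the flag the patch sets at probe time */
	bool has_soft_reset;	/* PHY driver provides a soft_reset() callback */
	int init_hw_calls;	/* how often the toy phy_init_hw() ran */
};

/* Toy probe: 'patched' selects whether the flag from this patch is set. */
static void toy_probe(struct toy_phy *phy, bool patched, bool has_soft_reset)
{
	*phy = (struct toy_phy){
		.suspended = patched,
		.has_soft_reset = has_soft_reset,
	};
}

/* Toy phy_init_hw(): only a soft reset clears the suspended flag. */
static void toy_init_hw(struct toy_phy *phy)
{
	phy->init_hw_calls++;
	if (phy->has_soft_reset)
		phy->suspended = false;
}

/* Toy attach (iface goes up): MAC takes over PM, PHY gets initialized. */
static void toy_attach(struct toy_phy *phy)
{
	phy->mac_managed_pm = true;
	toy_init_hw(phy);
}

/*
 * Toy MDIO bus resume. Returns true if it touched the PHY, i.e. ran the
 * toy phy_init_hw(). Assumption of this model: a PHY that is MAC-managed
 * or already marked suspended is skipped.
 */
static bool toy_mdio_bus_resume(struct toy_phy *phy)
{
	if (phy->mac_managed_pm || phy->suspended)
		return false;
	toy_init_hw(phy);
	return true;
}
```

In this model, an unpatched probe followed by a resume before attach runs
the init with the interface still down (the failing case), a patched probe
skips it, and a patched PHY without soft_reset() keeps 'suspended' set
after attach, which is the case I am worried about.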
So, this patch mostly demonstrates the issues we have and the things I
found out. I'd be happy if someone could point me to a proper solution,
or more information that I am missing here. Thank you in advance and
happy hacking!
Signed-off-by: Wolfram Sang <wsa+renesas@xxxxxxxxxxxxxxxxxxxx>
---
drivers/net/phy/phy_device.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 8cff61dbc4b5..5cbb471700a8 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -3108,6 +3108,7 @@ static int phy_probe(struct device *dev)
/* Set the state to READY by default */
phydev->state = PHY_READY;
+ phydev->suspended = 1;
out:
/* Assert the reset signal */
--
2.30.2