RE: [PATCH] net: stmmac: don't stop RXC during LPI

From: Joakim Zhang
Date: Mon Jan 24 2022 - 01:26:12 EST



Hi Jisheng,

> -----Original Message-----
> From: Jisheng Zhang <jszhang@xxxxxxxxxx>
> Sent: 2022年1月24日 0:10
> To: Andrew Lunn <andrew@xxxxxxx>; Joakim Zhang
> <qiangqing.zhang@xxxxxxx>
> Cc: Giuseppe Cavallaro <peppe.cavallaro@xxxxxx>; Alexandre Torgue
> <alexandre.torgue@xxxxxxxxxxx>; Jose Abreu <joabreu@xxxxxxxxxxxx>;
> David S . Miller <davem@xxxxxxxxxxxxx>; Jakub Kicinski <kuba@xxxxxxxxxx>;
> Maxime Coquelin <mcoquelin.stm32@xxxxxxxxx>; netdev@xxxxxxxxxxxxxxx;
> linux-stm32@xxxxxxxxxxxxxxxxxxxxxxxxxxxx;
> linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH] net: stmmac: don't stop RXC during LPI
>
> On Mon, Jan 24, 2022 at 12:08:22AM +0800, Jisheng Zhang wrote:
> > On Sun, Jan 23, 2022 at 04:52:29PM +0100, Andrew Lunn wrote:
> > > On Sun, Jan 23, 2022 at 10:12:45PM +0800, Jisheng Zhang wrote:
> > > > I met can't receive rx pkt issue with below steps:
> > > > 0.plug in ethernet cable then boot normal and get ip from dhcp
> > > > server 1.quickly hotplug out then hotplug in the ethernet cable
> > > > 2.trigger the dhcp client to renew lease
> > > >
> > > > tcpdump shows that the request tx pkt is sent out successfully,
> > > > but the mac can't receive the rx pkt.
> > > >
> > > > The issue can easily be reproduced on platforms with PHY_POLL
> > > > external phy. If we don't allow the phy to stop the RXC during
> > > > LPI, the issue is gone. I think it's unsafe to stop the RXC during
> > > > LPI because the mac needs RXC clock to support RX logic.
> > > >
> > > > And the 2nd param clk_stop_enable of phy_init_eee() is a bool, so
> > > > use false instead of 0.
> > > >
> > > > Signed-off-by: Jisheng Zhang <jszhang@xxxxxxxxxx>
> > > > ---
> > > > drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +-
> > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > > > b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > > > index 6708ca2aa4f7..92a9b0b226b1 100644
> > > > --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > > > +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > > > @@ -1162,7 +1162,7 @@ static void stmmac_mac_link_up(struct
> > > > phylink_config *config,
> > > >
> > > > stmmac_mac_set(priv, priv->ioaddr, true);
> > > > if (phy && priv->dma_cap.eee) {
> > > > - priv->eee_active = phy_init_eee(phy, 1) >= 0;
> > > > + priv->eee_active = phy_init_eee(phy, false) >= 0;
> > >
> > > This has not caused issues in the past. So i'm wondering if this is
> > > somehow specific to your system? Does everybody else use a PHY which
> > > does not implement this bit? Does your synthesis of the stmmac have
> > > a different clock tree?
> > >
> > > By changing this value for every instance of the stmmac, you are
> > > potentially causing a power regression for stmmac implementations
> > > which don't need the clock. So we need a clear understanding,
> > > stopping the clock is wrong in general and so the change is correct
> > > in
> >
> > I think this is a common issue because the MAC needs phy's RXC for RX
> > logic. But it's better to let other stmmac users verify. The issue can
> > easily be reproduced on platforms with PHY_POLL external phy.
> > Or other platforms use a dedicated clock rather than clock from phy
> > for MAC's RX logic?
> >
> > If the issue turns out specific to my system, then I will send out a
> > new patch to adopt your suggestion.
> >
>
> + Joakim
>
> > Hi Joakim, IIRC, you have stmmac + external RTL8211F phy platform, but
> > I'm not sure whether your platform have an irq for the phy. could you
> > help me to check whether you can reproduce the issue on your platform?

Yes, i.MX8MP uses the stmmac + external RTL8211F which works on PHY_POLL mode.
I tried the reproduce steps you provided, but the Ethernet can work properly. I don't know what abnormal behavior should appear obviously?

Regarding to your reported issue, I guess this is a real general issue for SNPS stmmac working on RGMII mode, not sure if other mii modes also suffer similar issue.
Actually we have a same patch for it at local since 5.10 to fix a suspend/resume issue.
https://source.codeaurora.org/external/imx/linux-imx/commit/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c?h=lf-5.10.y&id=a7864e9fbc8f8f99e785ce7419a682651c89b8f7

The root cause is that stmmac needs RXC clock generating from PHY for some receive logic, if RXC clock is not feeding in time, stmmac would be broken. And the feedback from SNPS guys, they confirm that stmmac needs this RXC clock and there is no other clocks can be routed if RXC is not present.

Best Regards,
Joakim Zhang