Re: [PATCH net] net: phy: Avoid multiple suspends

From: Heiner Kallweit
Date: Tue Mar 10 2020 - 13:34:55 EST


On 10.03.2020 17:46, Florian Fainelli wrote:
> On 3/10/20 7:16 AM, Geert Uytterhoeven wrote:
>> Hi Florian, David,
>>
>> On Mon, Feb 24, 2020 at 5:59 AM David Miller <davem@xxxxxxxxxxxxx> wrote:
>>> From: Florian Fainelli <f.fainelli@xxxxxxxxx>
>>> Date: Thu, 20 Feb 2020 15:34:53 -0800
>>>
>>>> It is currently possible for a PHY device to be suspended as part of a
>>>> network device driver's suspend call while it is still being attached to
>>>> that net_device, either via phy_suspend() or implicitly via phy_stop().
>>>>
>>>> Later on, when the MDIO bus controller gets suspended, we would attempt
>>>> to suspend the PHY again because it is still attached to a network
>>>> device.
>>>>
>>>> This is both a waste of time and creates an opportunity for improper
>>>> clock/power management bugs to creep in.
>>>>
>>>> Fixes: 803dd9c77ac3 ("net: phy: avoid suspending twice a PHY")
>>>> Signed-off-by: Florian Fainelli <f.fainelli@xxxxxxxxx>
>>>
>>> Applied, and queued up for -stable, thanks Florian.
>>
>> This patch causes a regression on r8a73a4/ape6evm and sh73a0/kzm9g.
>> After resume from s2ram, Ethernet no longer works:
>>
>> PM: suspend exit
>> nfs: server aaa.bbb.ccc.ddd not responding, still trying
>> ...
>>
>> Reverting commit 503ba7c6961034ff ("net: phy: Avoid multiple suspends")
>> fixes the issue.
>>
>> On both boards, an SMSC LAN9220 is connected to a power-managed local
>> bus.
>>
>> I added some debug code to check when the clock driving the local bus
>> is stopped and started, but I see no difference before/after. Hence I
>> suspect the Ethernet chip is no longer reinitialized after resume.
>
> Can you provide a complete log? Do you use the Generic PHY driver or a
> specialized one? Do you have a way to dump the registers at the time of
> failure and see if BMCR.PDOWN is still set somehow?
>
Maybe the reason for the misbehavior is that mdio_bus_phy_may_suspend()
is also checked in mdio_bus_phy_resume(), which is not very logical given
the naming. The call to phy_resume() may therefore be skipped.
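
To illustrate the suspected sequence, here is a minimal user-space sketch.
It is not the kernel code: the struct and all model_* names are
hypothetical stand-ins, and it assumes that after the patch the
"may suspend" check returns false once the PHY is already marked as
suspended.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct phy_device; only the state we need. */
struct phy_model {
	bool suspended;	/* true once the PHY has been powered down */
};

/* Stand-ins for phy_suspend()/phy_resume(): only track the power state. */
static void model_phy_suspend(struct phy_model *phy)
{
	phy->suspended = true;
}

static void model_phy_resume(struct phy_model *phy)
{
	phy->suspended = false;
}

/*
 * ASSUMPTION: models the check after the patch, which refuses to suspend
 * a PHY that is already suspended.
 */
static bool model_may_suspend(const struct phy_model *phy)
{
	return !phy->suspended;
}

/* MDIO bus suspend path: skips the PHY if it is already suspended. */
static void model_bus_suspend(struct phy_model *phy)
{
	if (!model_may_suspend(phy))
		return;
	model_phy_suspend(phy);
}

/* MDIO bus resume path: reuses the same check as the suspend path. */
static void model_bus_resume(struct phy_model *phy)
{
	if (!model_may_suspend(phy))
		return;	/* resume skipped, PHY stays powered down */
	model_phy_resume(phy);
}

/*
 * One suspend/resume cycle in which the MAC driver suspends the PHY
 * first. Returns true if the PHY is still suspended afterwards.
 */
static bool model_cycle_leaves_phy_suspended(void)
{
	struct phy_model phy = { .suspended = false };

	model_phy_suspend(&phy);	/* MAC driver suspend, e.g. via phy_stop() */
	model_bus_suspend(&phy);	/* no second suspend, as intended */
	model_bus_resume(&phy);		/* check fails, resume is skipped */

	return phy.suspended;
}
```

In this model the cycle ends with the PHY still suspended, which would
match the symptom Geert sees, if the assumption about the check holds.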


> Does the following help:
>
> diff --git a/drivers/net/ethernet/smsc/smsc911x.c
> b/drivers/net/ethernet/smsc/smsc911x.c
> index 49a6a9167af4..df17190c76c0 100644
> --- a/drivers/net/ethernet/smsc/smsc911x.c
> +++ b/drivers/net/ethernet/smsc/smsc911x.c
> @@ -2618,6 +2618,7 @@ static int smsc911x_resume(struct device *dev)
>  	if (netif_running(ndev)) {
>  		netif_device_attach(ndev);
>  		netif_start_queue(ndev);
> +		phy_resume(ndev->phydev);
>  	}
>
>  	return 0;
>