Re: [PATCH] net: phy: fix PHY possibly unwork after MDIO bus resume back

From: Heiner Kallweit
Date: Tue Apr 06 2021 - 14:43:40 EST


On 06.04.2021 20:32, Florian Fainelli wrote:
>
>
> On 4/6/2021 4:42 AM, Heiner Kallweit wrote:
>>
>> Waiting for ANEG_COMPLETE to be set wouldn't be a good option. Aneg may never
>> complete for different reasons, e.g. no physical link. And even if we use a
>> timeout this may add unwanted delays.
>>
>>> Do you have any other insights that can help me further locate the issue? Thanks.
>>>
>>
>> I think current MAC/PHY PM handling isn't perfect. Often we have the following
>> scenario:
>>
>> *suspend*
>> 1. PHY is suspended (mdio_bus_phy_suspend)
>> 2. MAC suspend callback (typically involving phy_stop())
>>
>> *resume*
>> 1. MAC resume callback (typically involving phy_start())
>> 2. PHY is resumed (mdio_bus_phy_resume), incl. calling phy_init_hw()
>>
>> Calling phy_init_hw() after phy_start() doesn't look right.
>> It seems to work in most cases, but there's a certain risk
>> that phy_init_hw() overwrites something, e.g. the advertised
>> modes.
>> I think we have two valid scenarios:
>>
>> 1. phylib PM callbacks are used, then the MAC driver shouldn't
>> touch the PHY in its PM callbacks, especially not call
>> phy_stop/phy_start.
>>
>> 2. MAC PM callbacks take care also of the PHY. Then I think we would
>> need a flag at the phy_device telling it to make the PHY PM
>> callbacks a no-op.
>
> Maybe part of the problem is that the FEC is calling phy_{stop,start} in
> its suspend/resume callbacks instead of phy_{suspend,resume} which would
> play nice and tell the MDIO bus PM callbacks that the PHY has already
> been suspended.
>
This basically is what I just proposed to test.

> I am also suspicious about whether Wake-on-LAN actually works with the
> FEC, you cannot wake from LAN if the PHY is stopped and powered down.
>
phy_stop() calls phy_suspend() which checks for WoL. Therefore this
should not be a problem.