Re: [RFC PATCH net v2] net: phy: Fix the issue that netif always links up after resuming

From: Heiner Kallweit
Date: Mon Dec 03 2018 - 02:21:08 EST


On 03.12.2018 05:35, Kunihiko Hayashi wrote:
> Hi Florian Heiner,
>
> On Fri, 30 Nov 2018 19:47:37 +0100 <hkallweit1@xxxxxxxxx> wrote:
>
>> On 30.11.2018 18:46, Florian Fainelli wrote:
>>>
>>>
>>> On 11/30/2018 1:25 AM, Kunihiko Hayashi wrote:
>>>> Even though the link is down before entering hibernation,
>>>> there is an issue that the network interface always links up after resuming
>>>> from hibernation.
>>>>
>>>> The phydev->state is PHY_READY before enabling the network interface, so
>>>> the link is down. After resuming from hibernation, the phydev->state is
>>>> forcibly set to PHY_UP in mdio_bus_phy_restore(), and the link becomes up.
>>>>
>>>> This patch adds a new convenient function to check whether the PHY is in
>>>> a started state, and expects to solve the issue by changing phydev->state
>>>> to PHY_UP and calling phy_start_machine() only when the PHY is started.
>>>>
>>>> Suggested-by: Heiner Kallweit <hkallweit1@xxxxxxxxx>
>>>> Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@xxxxxxxxxxxxx>
>>>> ---
>>>> drivers/net/phy/phy.c | 2 +-
>>>> drivers/net/phy/phy_device.c | 9 +++++----
>>>> include/linux/phy.h | 10 ++++++++++
>>>> 3 files changed, 16 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
>>>> index 1d73ac3..f484d03 100644
>>>> --- a/drivers/net/phy/phy.c
>>>> +++ b/drivers/net/phy/phy.c
>>>> @@ -670,7 +670,7 @@ void phy_stop_machine(struct phy_device *phydev)
>>>> cancel_delayed_work_sync(&phydev->state_queue);
>>>>
>>>> mutex_lock(&phydev->lock);
>>>> - if (phydev->state > PHY_UP && phydev->state != PHY_HALTED)
>>>> + if (phy_is_started(phydev))
>>>> phydev->state = PHY_UP;
>>>> mutex_unlock(&phydev->lock);
>>>> }
>>>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>>>> index ab33d17..2c39717 100644
>>>> --- a/drivers/net/phy/phy_device.c
>>>> +++ b/drivers/net/phy/phy_device.c
>>>> @@ -309,10 +309,11 @@ static int mdio_bus_phy_restore(struct device *dev)
>>>> return ret;
>>>>
>>>> /* The PHY needs to renegotiate. */
>>>> - phydev->link = 0;
>>>> - phydev->state = PHY_UP;
>>>> -
>>>> - phy_start_machine(phydev);
>>>> + if (phy_is_started(phydev)) {
>>>> + phydev->link = 0;
>>>> + phydev->state = PHY_UP;
>>>> + phy_start_machine(phydev);
>>>> + }
>>>
>>> Don't you need some of these steps to be performed under phydev->lock
>>> being held? See comment below.
>>>
>> Yes, locking should be done. The old code sets phydev->state
>> w/o holding the lock, I'd says this was wrong.
>
> Indeed. The phydev->state should be set with locking the mutex even here.
>
> And it seems that setting phydev->link and calling phy_start_machine() don't
> need to hold the lock. Is it correct?
>
Yes

>>>>
>>>> return 0;
>>>> }
>>>> diff --git a/include/linux/phy.h b/include/linux/phy.h
>>>> index 3ea87f7..c194b45 100644
>>>> --- a/include/linux/phy.h
>>>> +++ b/include/linux/phy.h
>>>> @@ -898,6 +898,16 @@ static inline bool phy_is_pseudo_fixed_link(struct phy_device *phydev)
>>>> }
>>>>
>>>> /**
>>>> + * phy_is_started - Convenience function for testing whether a PHY is in
>>>> + * a started state
>>>> + * @phydev: the phy_device struct
>>>> + */
>>>> +static inline bool phy_is_started(struct phy_device *phydev)
>>>> +{
>>>
>>> An assert with the phydev->lock mutex being held here would greatly
>>> help, because otherwise this is possibly racy.
>>>
>> Have a look at __phy_resume() to see what is meant with this comment.
>
> I see. I found that there was a lock detection in this function.
> The phy_is_started() should have the same detection, shouldn't it?
>
Correct

> Thank you,
>
>>>> + return phydev->state >= PHY_UP && phydev->state != PHY_HALTED;
>>>> +}
>>>> +
>>>> +/**
>>>> * phy_write_mmd - Convenience function for writing a register
>>>> * on an MMD on a given PHY.
>>>> * @phydev: The phy_device struct
>>>>
>>>
>
> ---
> Best Regards,
> Kunihiko Hayashi
>
Heiner
>
>