Re: [PATCH net v3 2/2] net: fec: Suspend the PHY on probe

From: John Ernberg
Date: Wed Mar 20 2024 - 11:26:40 EST


Hi Russel,

On 3/19/24 09:51, Russell King (Oracle) wrote:
> On Tue, Mar 19, 2024 at 08:37:44AM +0000, John Ernberg wrote:
>> There is also a case where the phy driver module is not automatically
>> loaded, in cases where request_module() fails, either due to the
>> userspace helper feature being compiled out or other reasons, and the
>> module is loaded manually later. I suspect for reasons like these the
>> genphy probe happens so late. My solution here doesn't cover non-loaded
>> modules either, but this could maybe be covered by moving phy_suspend()
>> to phy_probe(). Unless there is an even more clever way to go about it
>> which I can't see from inexperience.
>
> Note that in the case where the PHY driver module is loaded late,
> phy_probe() won't be called for the PHY until that happens.
>
> I would say if one wants a platform to behave with minimal power
> consumption, that is something that has to be done across the
> software stack, and that includes the boot firmware. So, if one
> wants the PHY to be in a low power state at boot time, then
> firmware needs to ensure that happens.
>
> Trying to shoe-horn that into the kernel isn't going to work
> because we get to decide what to do with the PHY way too late
> (due to PHY drivers being modular and on the rootfs.)
>

What we really want is the PHY to be suspended on suspend to RAM
regardless of
us having had an initial link up or not.

This worked prior to 4c0d2e96ba05 ("net: phy: consider that suspend2ram
may cut
off PHY power") which was added in Linux 5.11, and 557d5dc83f68 ("net:
fec: use
mac-managed PHY PM") which was added in Linux 5.12.

Since FEC requires mac_managed_pm the generic PM suspend-resume paths
are not
taken. The resume sequencing with generic PM has been broken with the
FEC since
generic PM of the mdio bus was added, as the FEC will do phy_start()
(via FEC
resume) and then generic PM runs phy_init_hw() via mdio bus resume
(previously:
less damaging phy_resume()) due to how the FEC IP block works.

Some background context to our usecase which might have been lost is
that our
system bring the link up based on outside input and to conserve power we
suspend
regularly, and this is the only situation where we care about the power
consumption. Since we cannot decide if link shall be up ourselves we can go
through numerous suspend cycles before the first link up. We could in theory
work around it in userspace by doing "ip link set <eth> up && ip link
set <eth>
down", but it wasn't required before.

Thanks! // John Ernberg