Re: [PATCH net-next v2 2/2] net: phy: marvell-88q2xxx: Prevent hwmon access with asserted reset
From: Dimitri Fedrau
Date: Thu Feb 20 2025 - 16:06:07 EST
Am Thu, Feb 20, 2025 at 03:29:10PM +0000 schrieb Russell King (Oracle):
> On Thu, Feb 20, 2025 at 04:22:14PM +0100, Dimitri Fedrau wrote:
> > Hi Russell,
> >
> > Am Thu, Feb 20, 2025 at 09:36:49AM +0000 schrieb Russell King (Oracle):
> > > On Thu, Feb 20, 2025 at 09:11:12AM +0100, Dimitri Fedrau wrote:
> > > > If the PHYs reset is asserted it returns 0xffff for any read operation.
> > > > This might happen if the user admins down the interface and wants to read
> > > > the temperature. Prevent reading the temperature in this case and return
> > > > with an network is down error. Write operations are ignored by the device
> > > > when reset is asserted, still return a network is down error in this
> > > > case to make the user aware of the operation gone wrong.
> > >
> > > If we look at where mdio_device_reset() is called from:
> > >
> > > 1. mdio_device_register() -> mdiobus_register_device() asserts reset
> > > before adding the device to the device layer (which will then
> > > cause the driver to be searched for and bound.)
> > >
> > > 2. mdio_probe(), deasserts the reset signal before calling the MDIO
> > > driver's ->probe method, which will be phy_probe().
> > >
> > > 3. after a probe failure to re-assert the reset signal.
> > >
> > > 4. after ->remove has been called.
> > >
> >
> > There is also phy_device_reset that calls mdio_device_reset.
>
> Ok, thanks for pointing that out.
>
> > > That is the sum total. So, while the driver is bound to the device,
> > > phydev->mdio.reset_state is guaranteed to be zero.
> > >
> > > Therefore, is this patch fixing a real observed problem with the
> > > current driver?
> > >
> > Yes, when I admin up and afterwards down the network device then the PHYs
> > reset is asserted. In this case phy_detach is called which calls
> > phy_device_reset(phydev, 1), ...
>
> I'm still concerned that this solution is basically racy - the
> netdev can come up/down while hwmon is accessing the device. I'm
> also unconvinced that ENETDOWN is a good idea here.
>
Yes it is racy. A solution would be to set a flag or whatever in
mdio_device_reset right before changes to the reset line happens and
clear the flag right after the changes have been done. Add a function
that return the state of the flag.
Better go back to EIO ? At first I thought it was a good idea because
the user gets at least a hint what the cause of the error is...
> While I get the "describe the hardware" is there a real benefit to
> describing this?
>
I can't follow.
> What I'm wondering is whether manipulating the reset signal in this
> case provides more pain than gain.
>
It seems so. I wasn't really aware of it :-)
Best regards,
Dimitri Fedrau