Re: [PATCH net-next v3] net: phy: Fix suspicious rcu_dereference usage

From: Kory Maincent
Date: Tue Jan 21 2025 - 11:12:16 EST


On Tue, 21 Jan 2025 15:44:34 +0000
Paul Barker <paul.barker.ct@xxxxxxxxxxxxxx> wrote:

> On 21/01/2025 13:01, Kory Maincent wrote:
> > On Tue, 21 Jan 2025 11:34:48 +0000
> > Paul Barker <paul.barker.ct@xxxxxxxxxxxxxx> wrote:
> >
> >> On 21/01/2025 09:38, Kory Maincent wrote:
> [...]
> [...]
> >> [...]
> [...]
> [...]
> >>
> >> (Cc'ing Niklas and Sergey as this relates to the ravb driver)
> >
> > Yes, thanks.
> >
> >> Why do we need to hold the rtnl mutex across the calls to
> >> netif_device_detach() and ravb_wol_setup()?
> >>
> >> My reading of Documentation/networking/netdevices.rst is that the rtnl
> >> mutex is held when the net subsystem calls the driver's ndo_stop method,
> >> which in our case is ravb_close(). So, we should take the rtnl mutex
> >> when we call ravb_close() directly, in both ravb_suspend() and
> >> ravb_wol_restore(). That would ensure that we do not call
> >> phy_disconnect() without holding the rtnl mutex and should fix this
> >> issue.
> >
> > Not sure about it. For example ravb_ptp_stop() called in ravb_wol_setup()
> > won't be protected by the rtnl lock.
>
> ravb_ptp_stop() modifies a couple of device registers and calls
> ptp_clock_unregister(). I don't see anything to suggest that this
> requires the rtnl lock to be held, unless I am missing something.

What happens if two ptp_clock_unregister() with the same ptp_clock pointer are
called simultaneously? From ravb_suspend and ravb_set_ringparam for example. It
may cause some errors.
For example the ptp->kworker pointer could be used after a kfree.
https://elixir.bootlin.com/linux/v6.12.6/source/drivers/ptp/ptp_clock.c#L416

Regards,
--
Köry Maincent, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com