Re: [PATCH v2 net 2/5] net: dsa: be compatible with masters which unregister on shutdown

From: Vladimir Oltean
Date: Fri Sep 13 2024 - 16:29:33 EST


Hi Alexander,

On Wed, Sep 04, 2024 at 08:31:13AM +0000, Sverdlin, Alexander wrote:
> > +static void lan9303_mdio_shutdown(struct mdio_device *mdiodev)
> > +{
> > + struct lan9303_mdio *sw_dev = dev_get_drvdata(&mdiodev->dev);
> > +
> > + if (!sw_dev)
> > + return;
> > +
> > + lan9303_shutdown(&sw_dev->chip);
> > +
> > + dev_set_drvdata(&mdiodev->dev, NULL);
> >  }
>
> This unfortunately didn't work well with LAN9303 and probably will not work
> with others:
>
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> CPU: 0 PID: 442 Comm: kworker/0:3 Tainted: G O 6.1.99+gitb7793b7d9b35 #1
> Workqueue: events_power_efficient phy_state_machine
> pc : lan9303_mdio_phy_read+0x1c/0x34
> lr : lan9303_phy_read+0x50/0x100
> Call trace:
> lan9303_mdio_phy_read+0x1c/0x34
> lan9303_phy_read+0x50/0x100
> dsa_slave_phy_read+0x40/0x50
> __mdiobus_read+0x34/0x130
> mdiobus_read+0x44/0x70
> genphy_update_link+0x2c/0x104
> genphy_read_status+0x2c/0x120
> phy_check_link_status+0xb8/0xcc
> phy_state_machine+0x198/0x27c
> process_one_work+0x1dc/0x450
> worker_thread+0x154/0x450
>
> as long as the ports are not down (and dsa_switch_shutdown() doesn't ensure it),
> we cannot just zero drvdata, because PHY polling will eventually call
>
> static int lan9303_mdio_phy_read(struct lan9303 *chip, int addr, int reg)
> {
> struct lan9303_mdio *sw_dev = dev_get_drvdata(chip->dev);
>
> return mdiobus_read_nested(sw_dev->device->bus, addr, reg);
>
> There are however multiple other unsafe patterns.
> I suppose current
>
> dsa_switch_shutdown();
> dev_set_drvdata(...->dev, NULL);
>
> pattern is broken in many cases...

Unfortunately the code portion which you've quoted for your reply does not
show the full story. dsa_switch_shutdown(), at the time of this patch,
was implemented like this (stripped of comments):

void dsa_switch_shutdown(struct dsa_switch *ds)
{
struct net_device *master, *slave_dev;
LIST_HEAD(unregister_list);
struct dsa_port *dp;

mutex_lock(&dsa2_mutex);
rtnl_lock();

list_for_each_entry(dp, &ds->dst->ports, list) {
if (dp->ds != ds)
continue;

if (!dsa_port_is_user(dp))
continue;

master = dp->cpu_dp->master;
slave_dev = dp->slave;

netdev_upper_dev_unlink(master, slave_dev);
unregister_netdevice_queue(slave_dev, &unregister_list);
}
unregister_netdevice_many(&unregister_list);

rtnl_unlock();
mutex_unlock(&dsa2_mutex);
}

I believe you would be wrong to blame this patch for exiting with the
slave/user ports still running (and thus ds->ops->phy_read() still
callable), because, as you can see, it doesn't do that - it unregisters
them, which also stops the net_device prior. So, both phylink_stop() and
phylink_destroy() would be called.

The patch had other problems though, and that led to the rework in
commit ee534378f005 ("net: dsa: fix panic when DSA master device unbinds
on shutdown"), rework which is in fact to blame for what you're reporting.

Given that we are talking about a fix to a fix, it doesn't really matter
in terms of backporting targets which one it is, but for correctness sake,
it is the later patch that fixed some things while introducing the race
condition.