Re: [PATCH v2] net: net_failover: Fix the deadlock in slave register

From: faicker mo

Date: Fri May 01 2026 - 22:46:42 EST


>
> > @@ -512,11 +512,11 @@ static int net_failover_slave_register(struct net_device *slave_dev,
> > dev_hold(slave_dev);
> >
> > if (netif_running(failover_dev)) {
> > - err = dev_open(slave_dev, NULL);
> > + err = netif_open(slave_dev, NULL);
>
> Same question for this call. netif_open() -> __dev_open() also does
> netdev_ops_assert_locked(dev), and on the failover_existing_slave_register()
> path the slave's ops lock is not held.
Yes.

>
> Before this change, dev_set_mtu()/dev_open()/dev_close() acquired the ops
> lock internally via netdev_lock_ops(), so the pre-existing-slave path was
> covered. Does switching unconditionally to the non-locking variants
> regress that path for slaves whose drivers use the per-device ops lock,
> both as a lockdep splat when CONFIG_PROVE_LOCKING is set and as a loss of
> ops-lock serialization against concurrent ethtool, queue and netdev-genl
> operations on the slave?
>
> > @@ -562,10 +562,10 @@ static int net_failover_slave_register(struct net_device *slave_dev,
> > err_vlan_add:
> > dev_uc_unsync(slave_dev, failover_dev);
> > dev_mc_unsync(slave_dev, failover_dev);
> > - dev_close(slave_dev);
> > -err_dev_open:
> > + netif_close(slave_dev);
> > +err_netif_open:
> > dev_put(slave_dev);
> > - dev_set_mtu(slave_dev, orig_mtu);
> > + netif_set_mtu(slave_dev, orig_mtu);
>
> Would it be more appropriate to acquire netdev_lock_ops(slave_dev) in
> net_failover_slave_register() when the caller is not the NETDEV_REGISTER
> notifier (where the ops lock is already held), instead of dropping the
> locking unconditionally?
Yes, we should add netdev_lock_ops in failover_existing_slave_register
before calling failover_slave_register.