Re: [syzbot] WARNING: refcount bug in nldev_newlink

From: Jason Gunthorpe
Date: Fri Dec 09 2022 - 09:10:36 EST


On Fri, Dec 09, 2022 at 09:42:29PM +0800, Hillf Danton wrote:
> On 9 Dec 2022 09:01:14 -0400 Jason Gunthorpe <jgg@xxxxxxxx>
> > On Thu, Dec 08, 2022 at 11:14:39AM +0200, Leon Romanovsky wrote:
> >
> > > Jason, what do you think?
> >
> > No, the key to this report is that the refcount dec is inside the tracker:
> >
> > > > __refcount_dec include/linux/refcount.h:344 [inline]
> > > > refcount_dec include/linux/refcount.h:359 [inline]
> > > > ref_tracker_free+0x539/0x6b0 lib/ref_tracker.c:118
> > > > netdev_tracker_free include/linux/netdevice.h:4039 [inline]
> >
> > Which is not underflowing the refcount on the dev, it is actually
> > trying to say the tracker has become unbalanced.
> >
> > Eg this put is not matched with a hold that specified the tracker.
> >
> > Probably this:
> >
> > diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
> > index ff35cebb25e265..115b77c5e9a146 100644
> > --- a/drivers/infiniband/core/device.c
> > +++ b/drivers/infiniband/core/device.c
> > @@ -2192,6 +2192,7 @@ static void free_netdevs(struct ib_device *ib_dev)
> > if (ndev) {
> > spin_lock(&ndev_hash_lock);
> > hash_del_rcu(&pdata->ndev_hash_link);
> > + netdev_tracker_free(ndev, &pdata->netdev_tracker);
> > spin_unlock(&ndev_hash_lock);
> >
> > /*
> > @@ -2201,7 +2202,7 @@ static void free_netdevs(struct ib_device *ib_dev)
> > * comparisons after the put
> > */
> > rcu_assign_pointer(pdata->netdev, NULL);
> > - dev_put(ndev);
> > + __dev_put(ndev);
> > }
> > spin_unlock_irqrestore(&pdata->netdev_lock, flags);
> > }
>
> Wonder why this makes sense given rcu_assign_pointer(pdata->netdev, NULL)
> under pdata->netdev_lock.

Oh, yeah, that is right, so we can just do the natural thing and release
the tracker at the put:

rcu_assign_pointer(pdata->netdev, NULL);
- dev_put(ndev);
+ netdev_put(ndev, &pdata->netdev_tracker);
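
To spell out the imbalance, here is a rough user-space sketch. It is a toy
model, not the kernel's ref_tracker: struct toy_dev, struct toy_tracker,
toy_hold() and toy_put() are all hypothetical names, and the single
"untracked" counter is my simplification of the bookkeeping. The idea is
only that a hold taken with a tracker has to be dropped with that same
tracker, otherwise the tracker side complains even though the device
refcount itself is fine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are NOT kernel types. */
struct toy_dev {
	int refcnt;     /* the device refcount itself */
	int untracked;  /* holds that were taken without a tracker */
};

struct toy_tracker {
	bool active;    /* true between a tracked hold and its matching put */
};

/* Take a reference; t may be NULL for an untracked hold. */
static void toy_hold(struct toy_dev *dev, struct toy_tracker *t)
{
	assert(dev != NULL);
	dev->refcnt++;
	if (t)
		t->active = true;
	else
		dev->untracked++;
}

/*
 * Drop a reference; t may be NULL for an untracked put.
 * The device refcount is always dropped. The return value says whether
 * the tracker bookkeeping stayed balanced (false models the WARNING).
 */
static bool toy_put(struct toy_dev *dev, struct toy_tracker *t)
{
	bool balanced = true;

	assert(dev != NULL);
	if (t) {
		if (t->active)
			t->active = false;
		else
			balanced = false;   /* tracker was never held */
	} else {
		if (dev->untracked > 0)
			dev->untracked--;
		else
			balanced = false;   /* no untracked hold to match */
	}
	dev->refcnt--;
	return balanced;
}
```

In this model a tracked toy_hold() followed by toy_put(dev, NULL) brings
refcnt back to 0 but returns false and leaves the tracker active, which is
the shape of the report. Passing the same tracker to toy_put() returns true.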


Jason