Re: WARNING in ib_umad_kill_port

From: Jason Gunthorpe
Date: Tue Apr 07 2020 - 10:35:35 EST


On Tue, Apr 07, 2020 at 02:39:42PM +0200, Dmitry Vyukov wrote:
> On Tue, Apr 7, 2020 at 1:55 PM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> >
> > On Tue, Apr 07, 2020 at 11:56:30AM +0200, Dmitry Vyukov wrote:
> > > > I'm not sure what could be done wrong here to elicit this:
> > > >
> > > > sysfs group 'power' not found for kobject 'umad1'
> > > >
> > > > ??
> > > >
> > > > I've seen another similar sysfs related trigger that we couldn't
> > > > figure out.
> > > >
> > > > Hard to investigate without a reproducer.
> > >
> > > Based on all of the sysfs-related bugs I've seen, my bet would be on
> > > some races. E.g. one thread registers devices, while another
> > > unregisters these.
> >
> > I did check that the naming is ordered right, at least we won't be
> > concurrently creating and destroying umadX sysfs of the same names.
> >
> > I'm also fairly sure we can't be destroying the parent at the same
> > time as this child.
> >
> > Do you see the above commonly? Could it be some driver core thing? Or
> > is it more likely something wrong in umad?
>
> Mmmm... I can't say, I am looking at some bugs very briefly. I've
> noticed that sysfs comes up periodically (or was it some other similar
> fs?).

Hmm..

Looking at the git history I see several cases where there are
ordering problems. I wonder if the rdma parent device is being
destroyed before the rdma devices complete destruction?

I see that syzkaller is creating a bunch of virtual net devices, and I
assume it has created a software rdma device on one of them.

So I'm guessing that it is also destroying a parent? But I can't guess
which.. Some simple tests with veth suggest it is OK because the
parent is virtual. But maybe bond or bridge or something?

The issue in rdma is that unregistering a netdev triggers an async
destruction of the rdma devices. This has to be async because the
netdev notification is delivered with RTNL held, and an rdma device
cannot be destroyed while holding RTNL.
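
To make that constraint concrete, here is a rough userspace model of
the deferral, not kernel code. Every name in it (fake_rtnl,
fake_rdma_dev, fake_unregister_work, fake_netdev_unregister_event) is
made up for illustration, and it assumes the teardown itself needs the
lock the notifier runs under, which is only one possible reason the
destroy cannot happen inline:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
static pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;

struct fake_rdma_dev {
    bool destroyed;
    pthread_t worker;
};

/*
 * Deferred teardown. ASSUMPTION for this model: teardown takes
 * fake_rtnl itself, so it cannot run from the notifier, which is
 * called with fake_rtnl already held.
 */
static void *fake_unregister_work(void *arg)
{
    struct fake_rdma_dev *dev = arg;

    pthread_mutex_lock(&fake_rtnl);
    dev->destroyed = true;
    pthread_mutex_unlock(&fake_rtnl);
    return NULL;
}

/* Called with fake_rtnl held, like a netdev notifier: only queue work. */
void fake_netdev_unregister_event(struct fake_rdma_dev *dev)
{
    int rc = pthread_create(&dev->worker, NULL, fake_unregister_work, dev);

    assert(rc == 0);
    (void)rc;
}

/*
 * Returns true if the device was still alive when the notifier
 * returned and was destroyed only after the lock was dropped.
 */
bool demo_deferred_destroy(void)
{
    struct fake_rdma_dev dev = { .destroyed = false };
    bool alive_after_notifier;

    pthread_mutex_lock(&fake_rtnl);
    fake_netdev_unregister_event(&dev);
    alive_after_notifier = !dev.destroyed; /* worker is blocked on the lock */
    pthread_mutex_unlock(&fake_rtnl);

    pthread_join(dev.worker, NULL);
    return alive_after_notifier && dev.destroyed;
}
```

The point of the model is only the ordering: the netdev side returns
from its notification before the rdma side has finished tearing down.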

So there is a race, I suppose, where the netdev can complete
destruction while rdma continues, and if something deletes the sysfs
entry holding the netdev before rdma completes, my guess is that we
hit this warning?

Could it be? I would love to know what netdev the rdma device was
created on, but it doesn't seem to show in the trace :\

This theory could be tested by adding a sleep to ib_unregister_work()
to widen the race window - is there some way to get syzkaller to
search for a reproducer with that patch?
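
As a rough illustration of why a sleep would help, here is a userspace
sketch of the suspected ordering, not the actual patch. The names
(struct model, unregister_work, run_model) and the millisecond values
are hypothetical. It only shows that delaying the deferred work lets
the parent disappear first:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical userspace model; nothing here is a kernel API. */
struct model {
    pthread_mutex_t lock;
    bool parent_present;        /* stands in for the parent's sysfs entry */
    bool child_saw_no_parent;   /* stands in for the sysfs warning */
    unsigned int work_delay_ms; /* the sleep added to the deferred work */
};

static void sleep_ms(unsigned int ms)
{
    struct timespec ts = {
        .tv_sec = ms / 1000,
        .tv_nsec = (long)(ms % 1000) * 1000000L,
    };

    nanosleep(&ts, NULL);
}

/* Models the async rdma unregistration work item. */
static void *unregister_work(void *arg)
{
    struct model *m = arg;

    sleep_ms(m->work_delay_ms); /* widen the race window */

    pthread_mutex_lock(&m->lock);
    if (!m->parent_present)
        m->child_saw_no_parent = true;
    pthread_mutex_unlock(&m->lock);
    return NULL;
}

/* Returns true if the child teardown ran after the parent was removed. */
bool run_model(unsigned int work_delay_ms, unsigned int parent_removal_ms)
{
    struct model m = {
        .parent_present = true,
        .work_delay_ms = work_delay_ms,
    };
    pthread_t worker;
    int rc;

    pthread_mutex_init(&m.lock, NULL);
    rc = pthread_create(&worker, NULL, unregister_work, &m);
    assert(rc == 0);
    (void)rc;

    /* The "netdev side" finishes its own teardown independently. */
    sleep_ms(parent_removal_ms);
    pthread_mutex_lock(&m.lock);
    m.parent_present = false;
    pthread_mutex_unlock(&m.lock);

    pthread_join(worker, NULL);
    pthread_mutex_destroy(&m.lock);
    return m.child_saw_no_parent;
}
```

With a long work delay and a quick parent removal, e.g.
run_model(300, 10), the child finds the parent gone. With no delay and
a slow parent removal, e.g. run_model(0, 300), it does not.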

Jason