Re: WARNING in ib_umad_kill_port

From: Greg Kroah-Hartman
Date: Tue Apr 07 2020 - 10:33:11 EST


On Tue, Apr 07, 2020 at 02:39:42PM +0200, Dmitry Vyukov wrote:
> On Tue, Apr 7, 2020 at 1:55 PM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> >
> > On Tue, Apr 07, 2020 at 11:56:30AM +0200, Dmitry Vyukov wrote:
> > > > I'm not sure what could be done wrong here to elicit this:
> > > >
> > > > sysfs group 'power' not found for kobject 'umad1'
> > > >
> > > > ??
> > > >
> > > > I've seen another similar sysfs related trigger that we couldn't
> > > > figure out.
> > > >
> > > > Hard to investigate without a reproducer.
> > >
> > > Based on all of the sysfs-related bugs I've seen, my bet would be on
> > > some races. E.g. one thread registers devices, while another
> > > unregisters these.
> >
> > I did check that the naming is ordered correctly; at least we won't be
> > concurrently creating and destroying umadX sysfs entries of the same name.
> >
> > I'm also fairly sure we can't be destroying the parent at the same
> > time as this child.
> >
> > Do you see the above commonly? Could it be some driver core thing? Or
> > is it more likely something wrong in umad?
>
> Mmmm... I can't say; I look at some bugs only very briefly. I've
> noticed that sysfs comes up periodically (or was it some other similar
> fs?). The general observation is that code frequently assumes only the
> happy scenario: say, a single administrator doing one thing
> at a time, slowly and carefully. It is not really hardened against
> armies of monkeys.
> But I did not look at code abstractions, bug patterns, contracts, etc.
>
> Greg KH may know better. Greg, as far as I remember you commented on
> some of these reports along the lines of, for example, "the warning is
> in sysfs code, but the bug is in the callers".

Yes, that is correct.