Re: [PATCH 2/2] kobject: send KOBJ_REMOVE uevent when the object is removed from sysfs
From: Rafael J. Wysocki
Date: Wed May 27 2020 - 04:35:06 EST
On Wed, May 27, 2020 at 9:50 AM Heikki Krogerus
<heikki.krogerus@xxxxxxxxxxxxxxx> wrote:
>
> On Tue, May 26, 2020 at 10:26:23AM +0200, Rafael J. Wysocki wrote:
> > On Tue, May 26, 2020 at 7:58 AM Greg Kroah-Hartman
> > <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > On Mon, May 25, 2020 at 03:49:01PM -0700, Dmitry Torokhov wrote:
> > > > On Sun, May 24, 2020 at 8:34 AM Greg Kroah-Hartman
> > > > <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > It is possible for a KOBJ_REMOVE uevent to be sent to userspace way
> > > > > after the files are actually gone from sysfs, due to how reference
> > > > > > counting for kobjects works. This should not be a problem, but it would
> > > > > be good to properly send the information when things are going away, not
> > > > > at some later point in time in the future.
> > > > >
> > > > > Before this move, if a kobject's parent was torn down before the child,
> > > >
> > > > ^^^^ And this is the root of the problem and what has to be fixed.
> > >
> > > I fixed that in patch one of this series. Turns out the user of the
> > > kobject was not even expecting that to happen.
> > >
> > > > > when the call to kobject_uevent() happened, the parent walk to try to
> > > > > reconstruct the full path of the kobject could be a total mess and cause
> > > > > crashes. It's not good to try to tear down a kobject tree from top
> > > > > > down, but let's at least try not to crash if a user does so.
> > > >
> > > > One can try, but if we keep proper reference counting then kobject
> > > > core should take care of actually releasing objects in the right
> > > > order. I do not think you should keep this patch, and instead see if
> > > > we can push call to kobject_put(kobj->parent) into kobject_cleanup().
> > >
> > > I tried that, but there was a _lot_ of underflow errors reported, so
> > > there's something else happening. Or my attempt was incorrect :)
> >
> > So it looks like there is something in there that's been overlooked so far.
> >
> > I'll try to look at Guenter's traces and figure out what went
> > wrong after Heikki's patch.
>
> At least one problem with that patch was that I was releasing the
> parent reference unconditionally.
That actually may be sufficient to explain all of the problems introduced by it.
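
To illustrate why an unconditional release of the parent reference could
produce refcount underflows like the ones mentioned above, here is a minimal
userspace sketch. It is only a model under stated assumptions: it is not
kernel code and not a reconstruction of the patch discussed in this thread,
which is not shown here. All names (struct node, node_init(), node_del(),
node_put(), node_cleanup(), run_scenario(), the parent_pinned and
unconditional_put fields) are hypothetical. The assumption is that explicit
removal already drops the child's reference on its parent, so a later
cleanup that drops it again without checking puts the parent one time too
many.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct node {
	int refs;               /* reference count */
	int underflows;         /* puts seen after refs already hit zero */
	int released;           /* set once cleanup has run */
	int parent_pinned;      /* nonzero while we hold a parent reference */
	int unconditional_put;  /* model the "always drop parent" behaviour */
	struct node *parent;
};

void node_put(struct node *n);

void node_init(struct node *n, struct node *parent, int unconditional_put)
{
	assert(n != NULL);
	n->refs = 1;
	n->underflows = 0;
	n->released = 0;
	n->parent_pinned = 0;
	n->unconditional_put = unconditional_put;
	n->parent = parent;
	if (parent) {
		parent->refs++;         /* the child pins its parent */
		n->parent_pinned = 1;
	}
}

/* Explicit removal: drop the parent reference right away (assumption). */
void node_del(struct node *n)
{
	if (n->parent_pinned) {
		n->parent_pinned = 0;
		node_put(n->parent);
	}
}

/* Final cleanup: drop the parent reference only if it is still held,
 * unless the node models the unconditional variant. */
void node_cleanup(struct node *n)
{
	n->released = 1;
	if (n->unconditional_put || n->parent_pinned) {
		n->parent_pinned = 0;
		node_put(n->parent);
	}
}

void node_put(struct node *n)
{
	if (!n)
		return;
	if (n->refs == 0) {             /* would be an underflow */
		n->underflows++;
		return;
	}
	if (--n->refs == 0)
		node_cleanup(n);
}

/* Remove the child explicitly, then drop both initial references.
 * Returns the number of underflows recorded on the parent. */
int run_scenario(int unconditional_put)
{
	struct node parent, child;

	node_init(&parent, NULL, unconditional_put);
	node_init(&child, &parent, unconditional_put);
	node_del(&child);
	node_put(&child);
	node_put(&parent);
	return parent.underflows;
}
```

In this model, run_scenario(0) drops the parent reference exactly once and
records no underflow, while run_scenario(1) drops it both at removal and at
cleanup, so the owner's final put on the parent is counted as an underflow.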