Re: [PATCH] kobject: Make sure the parent does not get released before its children

From: Greg Kroah-Hartman
Date: Sun May 24 2020 - 11:35:09 EST


On Sun, May 24, 2020 at 03:28:12PM +0200, Greg Kroah-Hartman wrote:
> On Sun, May 24, 2020 at 03:14:05PM +0200, Greg Kroah-Hartman wrote:
> > On Sun, May 24, 2020 at 02:57:27PM +0200, Greg Kroah-Hartman wrote:
> > > On Sat, May 23, 2020 at 08:44:06AM -0700, Randy Dunlap wrote:
> > > > On 5/23/20 8:36 AM, Greg Kroah-Hartman wrote:
> > > > > On Wed, May 13, 2020 at 06:18:40PM +0300, Heikki Krogerus wrote:
> > > > >> In the function kobject_cleanup(), kobject_del(kobj) is
> > > > >> called before the kobj->release(). That makes it possible to
> > > > >> release the parent of the kobject before the kobject itself.
> > > > >>
> > > > > >> To fix that, adding function __kobject_del() that does
> > > > >> everything that kobject_del() does except release the parent
> > > > >> reference. kobject_cleanup() then calls __kobject_del()
> > > > >> instead of kobject_del(), and separately decrements the
> > > > >> reference count of the parent kobject after kobj->release()
> > > > >> has been called.
> > > > >>
> > > > >> Reported-by: Naresh Kamboju <naresh.kamboju@xxxxxxxxxx>
> > > > >> Reported-by: kernel test robot <rong.a.chen@xxxxxxxxx>
> > > > >> Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"")
> > > > >> Suggested-by: "Rafael J. Wysocki" <rafael@xxxxxxxxxx>
> > > > >> Signed-off-by: Heikki Krogerus <heikki.krogerus@xxxxxxxxxxxxxxx>
> > > > >> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> > > > >> Reviewed-by: Brendan Higgins <brendanhiggins@xxxxxxxxxx>
> > > > >> Tested-by: Brendan Higgins <brendanhiggins@xxxxxxxxxx>
> > > > >> Acked-by: Randy Dunlap <rdunlap@xxxxxxxxxxxxx>
> > > > >> ---
> > > > >> lib/kobject.c | 30 ++++++++++++++++++++----------
> > > > >> 1 file changed, 20 insertions(+), 10 deletions(-)
> > > > >
> > > > > Stepping back, now that it turns out this patch causes more problems
> > > > > than it fixes, how is everyone reproducing the original crash here?
> > > >
> > > > Just load lib/test_printf.ko and boom!
> > >
> > > Thanks, that helps.
> > >
> > > Ok, in messing around with the kobject core more: originally we thought
> > > this was an issue of the kobject uevent happening for the parent pointer
> > > (when the parent was invalid). So, after moving things around some more,
> > > I'm now crashing in software_node_release() when we try to access
> > > swnode->parent->child_ids, as parent is invalid there.
> > >
> > > So I feel like this is a swnode bug, or a use of swnode in a way it
> > > shouldn't be that the testing framework is exposing somehow.
> > >
> > > Let me dig deeper...
> >
> > Ah, ick, static software nodes trying to be cleaned up in the totally
> > wrong order. You can't just try to randomly clean up a kobject anywhere
> > in the middle of the hierarchy, that's flat out not going to work
> > properly. Let me unwind it...
>
> Ok, the patch below fixes the test, there's not really anything wrong
> with the kobject core, except maybe the kobject uevent for removal,
> which I'll send a patch for.
>
> I'll write these up as a real set of patches after a bit.

They are now here:
https://lore.kernel.org/lkml/20200524153041.2361-1-gregkh@xxxxxxxxxxxxxxxxxxx/
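
For anyone following along, the ordering constraint discussed in the
quoted text can be modeled in plain userspace C. This is only a rough
sketch, not the kobject or swnode code: struct node, node_init(),
node_release() and node_put() are made-up names for illustration, and
child_count loosely stands in for the parent bookkeeping that
software_node_release() touches. It assumes each child holds a reference
on its parent, and that a release callback may dereference the parent,
so the parent reference is dropped only after the child's release has
run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted object; NOT struct kobject. */
struct node {
	struct node *parent;
	int refcount;
	int child_count;	/* bookkeeping a child's release will touch */
	bool released;
};

static void node_init(struct node *n, struct node *parent)
{
	n->parent = parent;
	n->refcount = 1;
	n->child_count = 0;
	n->released = false;
	if (parent) {
		parent->refcount++;	/* the child pins its parent */
		parent->child_count++;
	}
}

/* Release callback: it touches the parent, so the parent must be live. */
static void node_release(struct node *n)
{
	if (n->parent) {
		assert(!n->parent->released);
		n->parent->child_count--;
	}
	n->released = true;
}

static void node_put(struct node *n)
{
	struct node *parent;

	if (--n->refcount > 0)
		return;

	parent = n->parent;
	node_release(n);		/* 1. release while the parent is pinned */
	if (parent)
		node_put(parent);	/* 2. only then drop the parent reference */
}
```

With a root and one child set up through node_init(), calling
node_put() on the root first leaves it alive because the child still
pins it; the root is released only after the child's node_put() has run
the child's release. Swapping steps 1 and 2 in node_put() would let the
root be released first and trip the assert in node_release().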

thanks,

greg k-h