Re: [PATCH] kobject: Make sure the parent does not get released before its children
From: Dmitry Torokhov
Date: Sat May 23 2020 - 15:04:47 EST
On Sat, May 23, 2020 at 8:48 AM Randy Dunlap <rdunlap@xxxxxxxxxxxxx> wrote:
>
> On 5/23/20 8:36 AM, Greg Kroah-Hartman wrote:
> > On Wed, May 13, 2020 at 06:18:40PM +0300, Heikki Krogerus wrote:
> >> In the function kobject_cleanup(), kobject_del(kobj) is
> >> called before the kobj->release(). That makes it possible to
> >> release the parent of the kobject before the kobject itself.
> >>
> >> To fix that, add a function __kobject_del() that does
> >> everything that kobject_del() does except release the parent
> >> reference. kobject_cleanup() then calls __kobject_del()
> >> instead of kobject_del(), and separately decrements the
> >> reference count of the parent kobject after kobj->release()
> >> has been called.
> >>
> >> Reported-by: Naresh Kamboju <naresh.kamboju@xxxxxxxxxx>
> >> Reported-by: kernel test robot <rong.a.chen@xxxxxxxxx>
> >> Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"")
> >> Suggested-by: "Rafael J. Wysocki" <rafael@xxxxxxxxxx>
> >> Signed-off-by: Heikki Krogerus <heikki.krogerus@xxxxxxxxxxxxxxx>
> >> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> >> Reviewed-by: Brendan Higgins <brendanhiggins@xxxxxxxxxx>
> >> Tested-by: Brendan Higgins <brendanhiggins@xxxxxxxxxx>
> >> Acked-by: Randy Dunlap <rdunlap@xxxxxxxxxxxxx>
> >> ---
> >> lib/kobject.c | 30 ++++++++++++++++++++----------
> >> 1 file changed, 20 insertions(+), 10 deletions(-)
> >
> > Stepping back, now that it turns out this patch causes more problems
> > than it fixes, how is everyone reproducing the original crash here?
>
> Just load lib/test_printf.ko and boom!
>
>
> > Is it just the KUNIT_DRIVER_PE_TEST that is causing the issue?
> >
> > In looking at 7589238a8cf3 ("Revert "software node: Simplify
> > software_node_release() function""), the log messages there look
> > correct. sysfs can't create a duplicate file, and so when your test is
> > written to try to create software nodes, you always have to check the
> > return value. If you run the test in parallel, or before another test
> > has had a chance to clean up, the function will fail, correctly.
> >
> > So what real-world thing is this test "failure" trying to show?
Well, I'm not sure about the test, but speaking more generally,
shouldn't we postpone releasing the parent's reference until we are in
the kobj->release() handler? I.e. after all child state is cleared and
all memory is freed, _then_ we unpin the parent?
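
To make the ordering concrete, here is a minimal userspace sketch. It is
not kernel code and not the proposed patch: struct node, node_create(),
node_put() and release_log are hypothetical names standing in for a
kobject, its creation, kobject_put() and a trace of release calls. The
only point is that the child's state is torn down first and the parent
reference is dropped afterwards.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Userspace model only: "struct node" stands in for a kobject. */
struct node {
	const char *name;
	int refcount;
	struct node *parent;	/* a child holds one reference on its parent */
};

/* Records the order in which nodes are released, for inspection. */
static char release_log[64];

static void log_release(const char *name)
{
	size_t room = sizeof(release_log) - strlen(release_log) - 1;

	strncat(release_log, name, room);
	room = sizeof(release_log) - strlen(release_log) - 1;
	strncat(release_log, ";", room);
}

static struct node *node_create(const char *name, struct node *parent)
{
	struct node *n = calloc(1, sizeof(*n));

	if (!n)
		return NULL;
	n->name = name;
	n->refcount = 1;
	n->parent = parent;
	if (parent)
		parent->refcount++;	/* the child pins its parent */
	return n;
}

static void node_put(struct node *n)
{
	while (n && --n->refcount == 0) {
		struct node *parent = n->parent;

		/* 1. Release the child's own state first. */
		log_release(n->name);
		free(n);

		/* 2. Only now drop the reference on the parent. */
		n = parent;
	}
}
```

With a parent and one child, dropping the creator's reference on the
parent releases nothing while the child still pins it. Dropping the last
reference on the child then releases the child and, only after that, the
parent.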
Thanks.
--
Dmitry