Re: kobject_init_and_add() confusion

From: Rafael J. Wysocki
Date: Fri May 10 2019 - 03:53:34 EST


On Fri, May 10, 2019 at 4:36 AM Tobin C. Harding <me@xxxxxxxx> wrote:
>
> On Wed, May 01, 2019 at 09:54:16AM +0200, Rafael J. Wysocki wrote:
> > On Wed, May 1, 2019 at 1:38 AM Tobin C. Harding <me@xxxxxxxx> wrote:
> > >
> > > Hi,
> > >
> > > Looks like I've created a bit of confusion trying to fix memleaks in
> > > calls to kobject_init_and_add(). Its spread over various patches and
> > > mailing lists so I'm starting a new thread and CC'ing anyone that
> > > commented on one of those patches.
> > >
> > > If there is a better way to go about this discussion please do tell me.
> > >
> > > The problem
> > > -----------
> > >
> > > Calls to kobject_init_and_add() are leaking memory throughout the kernel
> > > because of how the error paths are handled.
> > >
> > > The solution
> > > ------------
> > >
> > > Write the error path code correctly.
> > >
> > > Example
> > > -------
> > >
> > > We have samples/kobject/kobject-example.c but it uses
> > > kobject_create_and_add(). I thought of adding another example file here
> > > but could not think of how to do it off the top of my head without being
> > > super contrived. Can add this to the TODO list if it will help.
> > >
> > > Here is an attempted canonical usage of kobject_init_and_add() typical
> > > of the code that currently is getting it wrong. This is the second time
> > > I've written this and the first time it was wrong even after review (you
> > > know who you are, you are definitely buying the next round of drinks :)
> > >
> > >
> > > Assumes we have an object in memory already that has the kobject
> > > embedded in it. Variable 'kobj' below would typically be &ptr->kobj
> > >
> > >
> > > void fn(void)
> > > {
> > > int ret;
> > >
> > > ret = kobject_init_and_add(kobj, ktype, NULL, "foo");
> > > if (ret) {
> > > /*
> > > * This means kobject_init() has succeeded
> > > * but kobject_add() failed.
> > > */
> > > goto err_put;
> > > }
> > >
> > > ret = some_init_fn();
> > > if (ret) {
> > > /*
> > > * We need to wind back kobject_add() AND kobject_put().
> >
> > kobject_add() and kobject_init() I suppose?
> >
> > > * kobject_add() incremented the refcount in
> > > * kobj->parent, that needs to be decremented THEN we need
> > > * the call to kobject_put() to decrement the refcount of kobj.
> > > */
> >
> > So actually, if you look at kobject_cleanup(), it calls kobject_del()
> > if kobj->state_in_sysfs is set.
> >
> > Now, if you look at kobject_add_internal(), it sets
> > kobj->state_in_sysfs when about to return 0 (success).
> >
> > Therefore calling kobject_put() without the preceding kobject_del() is
> > not a bug technically, even though it will trigger the "auto cleanup
> > kobject_del" message with debug enabled.
> >
> > > goto err_del;
> > > }
> > >
> > > ret = some_other_init_fn();
> > > if (ret)
> > > goto other_err;
> > >
> > > kobject_uevent(kobj, KOBJ_ADD);
> > > return 0;
> > >
> > > other_err:
> > > other_clean_up_fn();
> > > err_del:
> > > kobject_del(kobj);
> > > err_put:
> > > kobject_put(kobj);
> > >
> > > return ret;
> > > }
> > >
> > >
> > > Have I got this correct?
> > >
> > > TODO
> > > ----
> > >
> > > - Fix all the callsites to kobject_init_and_add()
> > > - Further clarify the function docstring for kobject_init_and_add() [perhaps]
> > > - Add a section to Documentation/kobject.txt [optional]
> > > - Add a sample usage file under samples/kobject [optional]
> >
> > The plan sounds good to me, but there is one thing to note IMO:
> > kobject_cleanup() invokes the ->release() callback for the ktype, so
> > these callbacks need to be able to cope with kobjects after a failing
> > kobject_add() which may not be entirely obvious to developers
> > introducing them ATM.
>
> It has taken a while for this to soak in. This is actually quite an
> insidious issue. If I give an example and perhaps we can come to a
> solution. This example is based on the code (and assumptions) in
> mm/slub.c
>
> If a developer has an object that they wish to add to sysfs they go
> ahead and embed a kobject in it. Correctly set up a ktype including
> release function that just frees the object (using container of). Now
> assume that the object is already set up and in use when we go to set up
> the sysfs entry. If kobject_init_and_add() fails and we correctly call
> kobject_put() the containing object will be free'd. Yet the calling
> code may not be done with the object, more to the point just because
> sysfs setup fails the object is now unusable. Besides the interesting
> theoretical discussion this means we cannot just go and willy-nilly add
> calls to kobject_put() in the error path of kobject_init_and_add() if
> the original code was not written under the assumption that the release
> method could be called during the error path (I have found 2 places at
> least where behaviour of calling the release method is non-trivial to
> ascertain).

Well, generally speaking, you can add kobject_put() somewhere only if
invoking the ->release() callback for the ktype of the kobject in
question is safe at that point.

Now, it may be unsafe for two reasons: there may be active references
to the memory that would be freed by ->release() along with the
kobject (as you said above) or the code in ->release() may have
assumed some initialization to take place before it runs and so
->release() should not be invoked before completing that
initialization.

> I guess, as Greg said, its just a matter that reference counting within
> the kernel is a hard problem. So we fix the easy ones and then look a
> bit harder at the hard ones ...
>
> Any better suggestion?

Well, I would say we fix all of them, but we are careful enough to
understand what's going on.

And the more complex cases will take more work (and time) to fix. As
always. :-)