Re: memleak around kobject_init_and_add()

From: Tobin C. Harding
Date: Sat Apr 27 2019 - 19:34:26 EST


On Sat, Apr 27, 2019 at 09:28:09PM +0200, Greg Kroah-Hartman wrote:
> On Sat, Apr 27, 2019 at 06:13:30PM +1000, Tobin C. Harding wrote:
> > (Note at bottom on reasons for 'To' list 'Cc' list)
> >
> > Hi,
> >
> > kobject_init_and_add() seems to be routinely misused. A failed call to this
> > function requires a call to kobject_put() otherwise we leak memory.
> >
> > Examples memleaks can be seen in:
> >
> > mm/slub.c
> > fs/btrfs/sysfs.c
> > fs/xfs/xfs_sysfs.h: xfs_sysfs_init()
> >
> > Question: Do we fix the misuse or fix the API?
>
> Fix the misuse.

Cool, I got it!

> > $ git grep kobject_init_and_add | wc -l
> > 117
> >
> > Either way, we will have to go through all 117 call sites and check them.
>
> Yes. Same for other functions like device_add(), that is the "pattern"
> those users must follow.
>
> > I
> > don't mind fixing them all but I don't want to do it twice because I chose the
> > wrong option. Reaching out to those more experienced for a suggestion please.
> >
> > Fix the API
> > -----------
> >
> > Typically init functions do not require cleanup if they fail, this argument
> > leads to this patch
> >
> > diff --git a/lib/kobject.c b/lib/kobject.c
> > index aa89edcd2b63..62328054bbd0 100644
> > --- a/lib/kobject.c
> > +++ b/lib/kobject.c
> > @@ -453,6 +453,9 @@ int kobject_init_and_add(struct kobject *kobj, struct kobj_type *ktype,
> > retval = kobject_add_varg(kobj, parent, fmt, args);
> > va_end(args);
> >
> > + if (retval)
> > + kobject_put(kobj);
> > +
> > return retval;
> > }
> > EXPORT_SYMBOL_GPL(kobject_init_and_add);
>
> I would _love_ to do this, but realize what a kobject really is.
>
> It's just a "base object" that is embedded inside of some other object.
> The kobject core has no idea what is going on outside of itself. If the
> kobject_init_and_add() function fails, it can NOT drop the last
> reference on itself, as that would cause the memory owned by the _WHOLE_
> structure the kobject is embedded in, to be freed.
>
> And the kobject core can not "know" that something else needed to be
> done _before_ that memory could be freed. What if the larger structure
> needs to have some other destructor called on it first? What if
> some other api initialization needs to be torn down.
>
> As an example, consider this code:
>
> struct foo {
> struct kobject kobj;
> struct baz *baz;
> };
>
> void foo_release(struct kobject *kobj)
> {
> struct foo *foo = container_of(kobj, struct foo, kobj);
> kfree(foo);
> }
>
> struct kobj_type foo_ktype = {
> .release = foo_release,
> };
>
> struct foo *foo_create(struct foo *parent, char *name)
> {
> struct *foo;
>
> foo = kzalloc(sizeof(*foo), GFP_KERNEL);
> if (!foo)
> return NULL;
>
> foo->baz = baz_create(name);
> if (!foo->baz)
> return NULL;
>
> ret = kobject_init_and_add(&foo->kobj, foo_ktype, &parent->kobj, "foo-%s", name);
> if (ret) {
> baz_destroy(foo->baz);
> kobject_put(&foo->kobj);
> return NULL;
> }
>
> return foo;
> }
>
> void foo_destroy(struct foo *foo)
> {
> baz_destroy(foo->baz);
> kobject_del(&foo->kobj);
> }
>
> Now if kobject_init_and_add() had failed, and called kobject_put() right
> away, that would have freed the larger "struct foo", but not cleaned up
> the reference to the baz pointer.
>
> Yes, you can move all of the other destruction logic into the release
> function, to then get rid of baz, but that really doesn't work in the
> real world as there are times you want to drop that when you "know" you
> can drop it, not when the last reference goes away as those are
> different lifecycles.
>
> Same thing goes for 'struct device'. It too is a kobject, so think
> about if the driver core's call to initialize the kobject failed, would
> it be ok at that exact moment in time to free everything?
>
> Look at the "joy" that is device_add(). If kobject_add() fails, we have
> to clean up the glue directory that we had created, _before_ we can then
> call put_device(). Then stack another layer on top of that, look at
> usb_new_device(). If the call to device_add() fails, it needs to do
> some housekeeping before it can drop the last reference to the device to
> free the memory up.

Thanks for the detailed response, unusual objects call for unusual
coding practices eh ;)

> > Fix all the call sites
> > ----------------------
> >
> > Go through all 117 call sites and add kobj_put() in the error path.
>
> Yes.
>
> > This example from mm/slub.c
> >
> > diff --git a/mm/slub.c b/mm/slub.c
> > index d30ede89f4a6..84a9d6c06c27 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -5756,8 +5756,10 @@ static int sysfs_slab_add(struct kmem_cache *s)
> >
> > s->kobj.kset = kset;
> > err = kobject_init_and_add(&s->kobj, &slab_ktype, NULL, "%s", name);
> > - if (err)
> > + if (err) {
> > + kobject_put(&s->kobj);
> > goto out;
> > + }
>
> Yup, it sucks, but unless you can think of a better way to do all of
> this, that's the requirement here. Again, same thing for a call to
> device_add(). Another subsystem got burned by this just the other day
> and we added yet-another-line in the documentation trying to call it out
> explicitly.
>
> Kernel programming is hard, sorry, let's go shopping...

lols, and good documentation is at times even harder than kernel
programming.

Thanks, patches to follow.


Tobin.