Re: [PATCH v8 11/12] zram: fix crashes with cpu hotplug multistate
From: Petr Mladek
Date: Tue Nov 02 2021 - 10:51:46 EST
On Tue 2021-11-02 15:15:19, Petr Mladek wrote:
> On Tue 2021-10-26 23:37:30, Ming Lei wrote:
> > On Tue, Oct 26, 2021 at 10:48:18AM +0200, Petr Mladek wrote:
> > > Below are more details about the livepatch code. I hope that it will
> > > help you to see if zram has similar problems or not.
> > >
> > > We have kobject in three structures: klp_func, klp_object, and
> > > klp_patch, see include/linux/livepatch.h.
> > >
> > > These structures have to be statically defined in the module sources
> > > because they define what is livepatched, see
> > > samples/livepatch/livepatch-sample.c
> > >
> > > The kobject is used there to show information about the patch, patched
> > > objects, and patched functions, in sysfs. And most importantly,
> > > the sysfs interface can be used to disable the livepatch.
> > >
> > > The problem with static structures is that the module must stay
> > > in the memory as long as the sysfs interface exists. It can be
> > > solved in module_exit() callback. It could wait until the sysfs
> > > interface is destroyed.
> > >
> > > kobject API does not support this scenario. The relase() callbacks
> >
> > kobject_delete() is for supporting this scenario, that is why we don't
> > need to grab module refcnt before calling show()/store() of the
> > kobject's attributes.
> >
> > kobject_delete() can be called in module_exit(), then any show()/store()
> > will be done after kobject_delete() returns.
>
> I am a bit confused. I do not see kobject_delete() anywhere in kernel
> sources.
>
> I see only kobject_del() and kobject_put(). AFAIK, they do _not_
> guarantee that either the sysfs interface was destroyed or
> the release callbacks were called. For example, see
> schedule_delayed_work(&kobj->release, delay) in kobject_release().
Grr, I always get confused by the code. kobject_del() actually waits
until the sysfs interface gets destroyed. This is why there is
the deadlock.
But kobject_put() is _not_ synchronous. And the comment above
kobject_add() repeat 3 times that kobject_put() must be called
on success:
* Return: If this function returns an error, kobject_put() must be
* called to properly clean up the memory associated with the
* object. Under no instance should the kobject that is passed
* to this function be directly freed with a call to kfree(),
* that can leak memory.
*
* If this function returns success, kobject_put() must also be called
* in order to properly clean up the memory associated with the object.
*
* In short, once this function is called, kobject_put() MUST be called
* when the use of the object is finished in order to properly free
* everything.
and similar text in Documentation/core-api/kobject.rst
After a kobject has been registered with the kobject core successfully, it
must be cleaned up when the code is finished with it. To do that, call
kobject_put().
If I read the code correctly then kobject_put() calls kref_put()
that might call kobject_delayed_cleanup(). This function does a lot
of things and need to access struct kobject.
> IMHO, kobject API does not support static structures and module
> removal.
If kobject_put() has to be called also for static structures then
module_exit() must explicitly wait until the clean up is finished.
Best Regards,
Petr