Re: [PATCH V4 1/3] livepatch: remove 'struct completion finish' from klp_patch
From: Ming Lei
Date: Fri Nov 05 2021 - 08:04:45 EST
On Wed, Nov 03, 2021 at 01:52:29PM +0100, Petr Mladek wrote:
> On Wed 2021-11-03 08:51:32, Ming Lei wrote:
> > On Tue, Nov 02, 2021 at 04:56:10PM +0100, Petr Mladek wrote:
> > > On Tue 2021-11-02 22:59:30, Ming Lei wrote:
> > > > The completion finish is just for waiting release of the klp_patch
> > > > object, then releases module refcnt. We can simply drop the module
> > > > refcnt in the kobject release handler of klp_patch.
> > > >
> > > > This way also helps to support allocating klp_patch from heap.
>
> First, I am sorry for confusion. The above description is correct.
> I does not say anything about that kobject_put() is synchronous.
>
> > > IMHO, this is wrong assumption. kobject_put() might do everyting
> > > asynchronously, see:
>
> I see that you are aware of this behavior.
>
> > > kobject_put()
> > > kobject_release()
> > > INIT_DELAYED_WORK(&kobj->release, kobject_delayed_cleanup);
> > > schedule_delayed_work(&kobj->release, delay);
> > >
> > > asynchronously:
> > >
> > > kobject_delayed_cleanup()
> > > kobject_cleanup()
> > > __kobject_del()
> >
> > OK, this is one generic kobject release vs. module unloading issue to
> > solve, not unique for klp module, and there should be lots of drivers
> > suffering from it.
>
> Yup, the problem is generic. It would be nice to have a generic
> solution. For example, add kobject_release_sync() that would return
> only when the object is really released.
The generic solution has been posted out:
https://lore.kernel.org/lkml/20211105063710.4092936-1-ming.lei@xxxxxxxxxx/
which needn't any generic API change, just flushes all scheduled kobject
cleanup work before freeing module, and the change is transparent for
drivers.
IMO, kobject_release_sync() is one wrong direction for fixing this
issue, since it is basically impossible to audit if one kobject_put()
need to be replaced with kobject_release_sync().
>
> > > > --- a/kernel/livepatch/core.c
> > > > +++ b/kernel/livepatch/core.c
> > > > @@ -678,11 +678,6 @@ static void klp_free_patch_finish(struct klp_patch *patch)
> > > > * cannot get enabled again.
> > > > */
> > > > kobject_put(&patch->kobj);
> > > > - wait_for_completion(&patch->finish);
> > > > -
> > > > - /* Put the module after the last access to struct klp_patch. */
> > > > - if (!patch->forced)
> > > > - module_put(patch->mod);
> > >
> > > klp_free_patch_finish() does not longer wait until the release
> > > callbacks are called.
> > >
> > > klp_free_patch_finish() is called also in klp_enable_patch() error
> > > path.
> > >
> > > klp_enable_patch() is called in module_init(). For example, see
> > > samples/livepatch/livepatch-sample.c
> > >
> > > The module must not get removed until the release callbacks are called.
> > > Does the module loader check the module reference counter when
> > > module_init() fails?
> >
> > Good catch, that is really one corner case, in which the kobject has to
> > be cleaned up before returning from mod->init(), cause there can't be
> > module unloading in case of mod->init() failure.
>
> Just to be sure. We want to keep the safe behavior in this case.
> There are many situations when klp_enable() might fail. And the error
> handling must be safe.
>
> In general, livepatch developers are very conservative.
> Livepatches are not easy to create. They are used only by people
> who really want to avoid reboot. We want to keep the livepatch kernel
> framework as safe as possible to avoid any potential damage.
The posted patch can cover this situation in which module_init() fails.
Thanks,
Ming