Re: [PATCH v2 2/3] livepatch: add atomic replace
From: Greg KH
Date: Fri Sep 15 2017 - 20:10:56 EST
On Fri, Sep 15, 2017 at 05:46:58PM +0200, Petr Mladek wrote:
> On Thu 2017-09-14 08:31:24, Jason Baron wrote:
> >
> >
> > On 09/14/2017 06:32 AM, Petr Mladek wrote:
> > > On Tue 2017-09-12 23:47:32, Jason Baron wrote:
> > >>
> > >>
> > >> On 09/12/2017 01:35 AM, Petr Mladek wrote:
> > >>> On Mon 2017-09-11 23:46:28, Jason Baron wrote:
> > >>>> On 09/11/2017 09:53 AM, Petr Mladek wrote:
> > >>>>> On Wed 2017-08-30 17:38:44, Jason Baron wrote:
> > >>>>>> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> > >>>>>> index 6004be3..21cecc5 100644
> > >>>>>> --- a/kernel/livepatch/core.c
> > >>>>>> +++ b/kernel/livepatch/core.c
> > >>>>>> @@ -600,13 +603,38 @@ static void klp_free_patch(struct klp_patch *patch)
> > >>>>>> list_del(&patch->list);
> > >>>>>> }
> > >>>>>>
> > >>>>>> -static int klp_init_func(struct klp_object *obj, struct klp_func *func)
> > >>>>>> +void klp_patch_free_no_ops(struct klp_patch *patch)
> > >>>>>> +{
> > >>>>>> + struct klp_object *obj, *tmp_obj;
> > >>>>>> + struct klp_func *func, *tmp_func;
> > >>>>>> +
> > >>>>>> + klp_for_each_object(patch, obj) {
> > >>>>>> + list_for_each_entry_safe(func, tmp_func, &obj->func_list,
> > >>>>>> + func_entry) {
> > >>>>>> + list_del_init(&func->func_entry);
> > >>>>>> + kobject_put(&func->kobj);
> > >>>>>> + kfree(func->old_name);
> > >>>>>> + kfree(func);
> > >>>>>
> > >>>>> kobject_put() is asynchronous. The structure should be freed using
> > >>>>> the kobject release method.
> > >>>>>
> > >>>>> The question is how robust this needs to be. We might want to block
> > >>>>> other operations on the patch until all the release methods
> > >>>>> have been called. But this might be tricky, as there is not a single
> > >>>>> root kobject that would get destroyed at the end.
> > >>>>>
> > >>>>> A crazy solution would be to define a global atomic counter.
> > >>>>> It would get increased with each kobject_put() call and
> > >>>>> decreased in each release method. Then we could wait
> > >>>>> until it reaches zero.
> > >>>>>
> > >>>>> It should be safe to wait with klp_mutex taken. Note that this
> > >>>>> is not possible with patch->kobj, where the "enabled"
> > >>>>> attribute takes the mutex as well; see
> > >>>>> enabled_store() in kernel/livepatch/core.c.
> > >>>>
> > >>>> Thanks for pointing this out - it looks like the livepatch code uses
> > >>>> wait_for_completion() with special kobj callbacks. Perhaps there could
> > >>>> be a no-op callback, but I'd have to look at this more closely...
> > >>>
> > >>> The completion is usable for the root kobject, but you do not free
> > >>> it here. It might get pretty complicated if you need a separate
> > >>> completion for each freed kobject.
> > >>>
> > >>> A solution might be a global atomic counter and a waitqueue.
> > >>> Feel free to ask for more details.
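> > >>>
> > >>> Roughly like this (all names made up, untested, just to show
> > >>> the idea):
> > >>>
> > >>>	static atomic_t klp_pending_puts = ATOMIC_INIT(0);
> > >>>	static DECLARE_WAIT_QUEUE_HEAD(klp_put_wq);
> > >>>
> > >>>	/* before each kobject_put() on a func/object kobject */
> > >>>	atomic_inc(&klp_pending_puts);
> > >>>	kobject_put(&func->kobj);
> > >>>
> > >>>	/* at the end of each release callback */
> > >>>	if (atomic_dec_and_test(&klp_pending_puts))
> > >>>		wake_up(&klp_put_wq);
> > >>>
> > >>>	/* the caller then waits until all releases have run */
> > >>>	wait_event(klp_put_wq, atomic_read(&klp_pending_puts) == 0);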
> > >>>
> > >>
> > >> So the issue is that the user might access some of the klp_* data
> > >> structures via /sysfs after we have already freed them?
> > >
> > > yes
> > >
> > >> if so, this seems to be a common kernel pattern:
> > >>
> > >> kobject_del(kobj);
> > >> kobject_put(kobj);
> > >
> > > IMHO, this is used for other reason.
> > >
> > > kobject_del() removes the object from the hierarchy. Therefore
> > > it prevents new operations. But it does not wait for the existing
> > > operations to finish. Therefore there still might be users that
> > > access the data even after this function finishes.
> >
> > I looked into this further - and it does appear to wait until all
> > operations finish. In kernfs_drain() the code does:
> >
> > 	wait_event(root->deactivate_waitq,
> > 		   atomic_read(&kn->active) == KN_DEACTIVATED_BIAS);
> >
> > The call stack is:
> >
> > kobject_del()
> >   sysfs_remove_dir()
> >     kernfs_remove()
> >       __kernfs_remove()
> >         kernfs_drain()
> >
> > And __kernfs_remove() has already done a 'deactivate' pass beforehand:
> >
> > 	/* prevent any new usage under @kn by deactivating all nodes */
> > 	pos = NULL;
> > 	while ((pos = kernfs_next_descendant_post(pos, kn)))
> > 		if (kernfs_active(pos))
> > 			atomic_add(KN_DEACTIVATED_BIAS, &pos->active);
> >
> > So I *think* doing the kobject_del() first is sufficient. It may be
> > worth documenting this better if that is the case...
>
> Sigh, I am confused. Your arguments look convincing. But I am still
> wary. As I said, we had a long discussion about this a long
> time ago. Unfortunately, I do not remember all the details.
>
> In any case, the following is mentioned in
> Documentation/kobject.txt:
>
> === cut ===
> Once you registered your kobject via kobject_add(), you must never use
> kfree() to free it directly. The only safe way is to use kobject_put(). It
> is good practice to always use kobject_put() after kobject_init() to avoid
> errors creeping in.
>
> This notification is done through a kobject's release() method. Usually
> such a method has a form like::
>
> void my_object_release(struct kobject *kobj)
> {
> struct my_object *mine = container_of(kobj, struct my_object, kobj);
>
> /* Perform any additional cleanup on this object, then... */
> kfree(mine);
> }
>
> One important point cannot be overstated: every kobject must have a
> release() method, and the kobject must persist (in a consistent state)
> until that method is called. If these constraints are not met, the code is
> flawed. Note that the kernel will warn you if you forget to provide a
> release() method. Do not try to get rid of this warning by providing an
> "empty" release function; you will be mocked mercilessly by the kobject
> maintainer if you attempt this.
>
> Note, the name of the kobject is available in the release function, but it
> must NOT be changed within this callback. Otherwise there will be a memory
> leak in the kobject core, which makes people unhappy.
> === cut ===
>
>
> The fact is that if you enable CONFIG_DEBUG_KOBJECT_RELEASE, it will
> intentionally defer the final access to the struct kobject.
> See the schedule_delayed_work() call in kobject_release().
>
> The struct kobject is part of both struct klp_func and
> struct klp_object. The delayed call will cause an access
> to an already freed memory if you free the structures
> immediately after calling kobject_put().
>
>
> I am not sure why this is. Either we missed something and
> kernfs_drain() is not enough to avoid extra references
> to the kobject. Or the kobject authors are over-paranoid and
> are pushing people toward the only supported "right design"
> the hard way.
>
> Anyway, we either need to "fix" kobject implementation
> or we need to free the structures in the respective
> release callbacks.
You read the documentation correctly: you need to free your structures
in the release callbacks. kobject is doing this correctly.
Also, it's kind of hard to keep an extra reference around; kernfs does a
good job of keeping the kernel from crashing for issues like this. Try
enabling the above-mentioned kernel option, along with slab poisoning,
and watch your code blow up :)
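
For the klp_func case, that would look something like the following
(untested sketch, callback name made up, following the example from
Documentation/kobject.txt quoted above):

	static void klp_kobj_release_func(struct kobject *kobj)
	{
		struct klp_func *func;

		func = container_of(kobj, struct klp_func, kobj);
		kfree(func->old_name);
		kfree(func);
	}

The kfree() calls then move out of klp_patch_free_no_ops() into the
release callback, and kobject_put() becomes the last thing the free
path does with the kobject.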
thanks,
greg k-h