Re: [RFC PATCH] livepatch: allow removal of a disabled patch

From: Josh Poimboeuf
Date: Mon May 02 2016 - 11:08:53 EST


On Mon, May 02, 2016 at 01:57:22PM +0200, Miroslav Benes wrote:
> Currently we do not allow patch module to unload since there is no
> method to determine if a task is still running in the patched code.
>
> The consistency model gives us the way because when the patching
> finishes we know that all tasks were marked as safe to call a new
> patched function. Thus every new call to the function calls the new
> patched code and at the same time no task can be somewhere in the old
> code, because it had to leave that code to be marked as safe.
>
> We can safely let the patch module go after that.

I found this a little confusing because it talks about patching, whereas
we really want to remove the patch module after _unpatching_ it.

> Completion is used for synchronization between module removal and sysfs
> infrastructure. Note that we still do not allow the removal for
> immediate model, that is no consistency model.
>
> With this change a call to try_module_get() is moved to
> __klp_enable_patch from klp_register_patch to make module reference
> counting symmetric (module_put() is in a patch disable path) and to
> allow to take a new reference to a disabled module when being enabled.

One minor issue here: for patch->immediate, if somebody enabled and
disabled the patch several times, the module ref count would keep
increasing. Probably not a big deal... maybe we don't need to worry
about it.

> Also all kobject_put(&patch->kobj) calls are moved outside of klp_mutex
> lock protection to prevent a deadlock situation when
> klp_unregister_patch is called and sysfs directories are removed. There
> is no need to do the same for other kobject_put() callsites as we
> currently do not have their sysfs counterparts.
>
> Signed-off-by: Miroslav Benes <mbenes@xxxxxxx>
> ---
> Based on Josh's v2.
>
> There are still two things which need to be discussed.
>
> 1. Do we really need a completion? If I am not missing something
> kobject_del() always waits for sysfs callers to leave thanks to kernfs
> active protection. The only problem is in kobject_release() when
> CONFIG_DEBUG_KOBJECT_RELEASE=y. In this case kobject_delayed_cleanup()
> is delay-scheduled, which is a problem for us. For this we definitely
> need the completion, right? JiriS, is this correct?

Sysfs and kobjects hurt my brain; I need to take a refresher course, so
I don't have an answer to this yet :-) Maybe Jiri S can enlighten us.

> 2. I refuse to remove a module when patch->immediate is set or one of
> its func->immediate is set. We can do slightly better, because we can
> allow the module to go if patch->immediate is false and all func with
> immediate set to true are from some object which is not loaded. This is
> for discussion because it needs more changes in the code (I'd like to
> call klp_is_object_loaded() which is somewhere else and so on).

I don't think we need to worry about doing that unless somebody
complains about it. It's kind of obscure and patch removal isn't very
important anyway...

--
Josh