Re: livepatch: Implement separate coming and going module notifiers

From: Jessica Yu
Date: Fri Jan 29 2016 - 17:58:47 EST


+++ Josh Poimboeuf [29/01/16 13:42 -0600]:
On Fri, Jan 29, 2016 at 08:25:15PM +0100, Miroslav Benes wrote:
On Fri, 29 Jan 2016, Josh Poimboeuf wrote:

> On Fri, Jan 29, 2016 at 12:40:14PM -0500, Steven Rostedt wrote:
> > [ Added Rusty, as he's still maintainer of the module code ]
> >
> > On Fri, 29 Jan 2016 11:30:10 -0600
> > Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
> >
> > > On Fri, Jan 29, 2016 at 05:30:46PM +0100, Miroslav Benes wrote:
> > > > Otherwise than that it looks good. I agree there are advantages to split
> > > > the notifiers. For example we can replace the coming one with the function
> > > > call somewhere in load_module() to improve error handling if the patching
> > > > fails while loading a module. This would be handy with a consistency model
> > > > in the future.
> > >
> > > Yeah, we'll need something like that eventually. Though we'll need to
> > > make sure that ftrace_module_enable() is still called beforehand, after
> > > setting MODULE_STATE_COMING state, due to the race described in 5156dca.
> > >
> > > Something like:
> > >
> > > [note: klp_module_notify_coming() is replaced with klp_module_enable()]
> > >
> > > diff --git a/kernel/module.c b/kernel/module.c
> > > index 8358f46..aeabd81 100644
> > > --- a/kernel/module.c
> > > +++ b/kernel/module.c
> > > @@ -3371,6 +3371,13 @@ static int complete_formation(struct module *mod, struct load_info *info)
> > > mod->state = MODULE_STATE_COMING;
> > > mutex_unlock(&module_mutex);
> > >
> > > + ftrace_module_enable(mod);
> > > + err = klp_module_enable(mod);
> > > + if (err) {
> > > + ftrace_release_mod(mod);
> > > + return err;
> > > + }
> > > +
> > > blocking_notifier_call_chain(&module_notify_list,
> > > MODULE_STATE_COMING, mod);
> > > return 0;
> > > diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> > > index eca592f..c42cf37 100644
> > > --- a/kernel/trace/ftrace.c
> > > +++ b/kernel/trace/ftrace.c
> > > @@ -5045,9 +5045,6 @@ static int ftrace_module_notify(struct notifier_block *self,
> > > struct module *mod = data;
> > >
> > > switch (val) {
> > > - case MODULE_STATE_COMING:
> > > - ftrace_module_enable(mod);
> > > - break;
> > > case MODULE_STATE_GOING:
> > > ftrace_release_mod(mod);
> > > break;
> >
> > If we end up doing something like this, I would just say punt and have
> > the ftrace code be hardcoded into the module code and remove the
> > notifiers completely. ftrace (and live kernel patching for that matter)
> > are rather special. They are not a filesystem or driver. They are core
> > utilities and having them called directly from the module code may be
> > prudent and better to understand and control.
>
> Agreed, and we might as well make this change now to avoid more churn
> later.

It is possible to achieve the same goal even with the notifiers. They are
processed synchronously in complete_formation(). So we can put our klp
hook after that, right? Or better, put it to load_module() after
complete_formation() call. There is an error handling code even today
(that is, parse_args() or mod_sysfs_setup() can fail). Moreover, we'll
have a hook there with Jessica's relocation rework patch set.

Well, my feeling is that we should really apply livepatch relocations
before allowing any other notifiers to run, in case the relocations
affect them. But it's just a feeling; I don't have any specific
examples to justify it (yet).

So what I'm gathering from this discussion is that we are leaning
towards completely removing the ftrace notifier in favor of inserting direct
calls to ftrace_module_enable() and ftrace_release_mod() in the
module loader, as well as removing the livepatch coming module notifier
and hard-coding klp_module_enable() (formerly klp_module_notify_coming).

Since we're already doing all that, might we be able to just completely remove
the klp notifiers altogether as well, and replace klp_module_notify_going()
with something like klp_module_disable()? This way everything is symmetrical.

Then the whole thing might look something like this -

diff --git a/kernel/module.c b/kernel/module.c
index 8358f46..eccd289 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -979,8 +979,12 @@ SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
/* Final destruction now no one is using it. */
if (mod->exit != NULL)
mod->exit();
blocking_notifier_call_chain(&module_notify_list,
MODULE_STATE_GOING, mod);
+ klp_module_disable(mod);
+ ftrace_release_mod(mod);
+
async_synchronize_full();

/* Store the name of the last unloaded module for diagnostic purposes */
@@ -3371,6 +3375,13 @@ static int complete_formation(struct module *mod, struct load_info *info)
mod->state = MODULE_STATE_COMING;
mutex_unlock(&module_mutex);

+ ftrace_module_enable(mod);
+ err = klp_module_enable(mod); // write all relocations before calling coming notifiers
+ if (err) {
+ ftrace_release_mod(mod);
+ goto out;
+ }
+
blocking_notifier_call_chain(&module_notify_list,
MODULE_STATE_COMING, mod);
return 0;

The function call ordering here should emulate the same ordering were they
notifiers instead, with the priorities Josh suggested in the other mail.

On module load, ftrace_module_enable() is called first (as if it had priority
INT_MAX), then klp_module_enable() (INT_MAX-1), then the rest of the coming
notifier call chain. For the GOING part, the going call chain is called first,
then klp_module_disable() (INT_MIN+1), then ftrace_release_mod() last (INT_MIN).

Note: There are multiple places where the GOING notifiers are called (i.e. in
delete_module(), and in the error paths for do_init_module() and
load_module()), but the calls would look the same there as well.

Does this all sound OK?

Thanks,
Jessica