Re: [PATCH 2/2] livepatch: fix patched module loading race

From: Petr Mladek
Date: Thu Mar 05 2015 - 09:23:40 EST


On Thu 2015-03-05 09:52:41, Masami Hiramatsu wrote:
> (2015/03/04 22:17), Petr Mladek wrote:
> > On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote:
> >> It's possible for klp_register_patch() to see a module before the COMING
> >> notifier is called, or after the GOING notifier is called.
> >>
> >> That can cause all kinds of ugly races. As Pter Mladek reported:
> >>
> >> "The problem is that we do not keep the klp_mutex lock all the time when
> >> the module is being added or removed.
> >>
> >> First, the module is visible even before ftrace is ready. If we enable a patch
> >> in this time frame, adding ftrace ops will fail and the patch will get rejected
> >> just because bad timing.
> >
> > Ah, this is not true after all. I did not properly check when
> > MODULE_STATE_COMING was set. I though that it was before ftrace was
> > initialized but it was not true.
> >
> >
> >> Second, if we are "lucky" and enable the patch for the coming module when the
> >> ftrace is ready but before the module notifier has been called. The notifier
> >> will try to enable the patch as well. It will detect that it is already patched,
> >> return error, and the module will get rejected just because bad
> >> timing. The more serious problem is that it will not call the notifier for
> >> going module, so that the mess will stay there and we wont be able to load
> >> the module later.
> >
> > Ah, the race is there but the effect is not that serious in the
> > end. It seems that errors from module notifiers are ignored. In fact,
> > we do not propagate the error from klp_module_notify_coming(). It means
> > that WARN() from klp_enable_object() will be printed but the module
> > will be loaded and patched.
> >
> > I am sorry, I was confused by kGraft where kgr_module_init() was
> > called directly from module_load(). The errors were propagated. It
> > means that kGraft rejects module when the patch cannot be applied.
> >
> > Note that the current solution is perfectly fine for the simple
> > consistency model.
> >
> >
> >> Third, similar problems are there for going module. If a patch is enabled after
> >> the notifier finishes but before the module is removed from the list of modules,
> >> the new patch will be applied to the module. The module might disappear at
> >> anytime when the patch enabling is in progress, so there might be an access out
> >> of memory. Or the whole patch might be applied and some mess will be left,
> >> so it will not be possible to load/patch the module again."
> >
> > This is true.
>
> No, that's not true if you try_get_module() before patching. After the
> module state goes GOING (more correctly say, after try_release_module_ref()
> succeeded), all try_get_module() must fail :)
> So, please make sure to get module when applying patches.

The question is what to do if we could not get the module reference
because the module is going. We must not ignore the module because there is
a window when the module code could still get called, see
https://lkml.org/lkml/2015/3/4/776

It would be possible to wait until the module is removed but it might
take quite some time, especially if the mod->exit() function is complex
and need to wait for something.

Or we could refuse to add the patch but it is ugly.

I tend to go back and use the original idea with a boolean that is
updated when the module going notifier is called. It sets the border,
when it is safe to ignore the going module. There is no waiting,
no rejection.

I am actually getting near to send rfc patch v2 with this implementation.

Best Regards,
Petr
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/