Re: [RFC 10/10] kmod: add a sanity check on module loading

From: Rusty Russell
Date: Mon Dec 19 2016 - 22:06:08 EST


"Luis R. Rodriguez" <mcgrof@xxxxxxxxxx> writes:
> On Dec 16, 2016 9:54 PM, "Rusty Russell" <rusty@xxxxxxxxxxxxxxx> wrote:
> > AFAICT the mistake here is that kmod is returning "done, OK" when the
> > module it is trying to load is already loading (but not finished
> > loading). That's the root problem; it's an attempt at optimization by
> > kmod which goes awry.
>
> This is true! To be precise though the truth of the matter is that kmod'd
> respective usermode helper: modprobe can be buggy and may lie to us. It may
> allow request_module() to return 0 but since we don't validate it, any
> assumption we make can be deadly. In the case of get_fs_type() its a null
> dereference.

Wait, what?? I can't see that in get_fs_type, which hasn't changed
since 2013. If a caller is assuming get_fs_type() doesn't return NULL,
they're broken and need fixing of course:

struct file_system_type *get_fs_type(const char *name)
{
struct file_system_type *fs;
const char *dot = strchr(name, '.');
int len = dot ? dot - name : strlen(name);

fs = __get_fs_type(name, len);
if (!fs && (request_module("fs-%.*s", len, name) == 0))
fs = __get_fs_type(name, len);

if (dot && fs && !(fs->fs_flags & FS_HAS_SUBTYPE)) {
put_filesystem(fs);
fs = NULL;
}
return fs;
}

Where does this NULL-deref is the module isn't correctly loaded?

> *Iff* we want a sanity check to verify kmod's umh is not lying to us we
> need to verify after 0 was returned that it was not lying to us. Since kmod
> accepts aliases but find_modules_all() only works on the real module name a
> validation check cannot happen when all you have are aliases.

request_module() should block until resolution, but that's fundamentally
a userspace problem. Let's not paper over it in kernelspace.

> *Iff* we are sure we don't want a validation (or another earlier
> optimization to avoid calling out to modrobe if the alias requested is
> already present, which does the time shaving I mentioned on the tests) then
> naturally no request_module() calls returning 0 can assert information
> about the requested module. I think we might need to change more code if we
> accept we cannot trust request_module() calls, or we accept userspace
> telling the kernel something may mean we sometimes crash. This later
> predicament seems rather odd so hence the patch.
>
> Perhaps in some cases validation of work from a umh is not critical in
> kernel but for request_module() I can tell you that today get_fs_type code
> currently asserts the module found can never be NULL.

OK, what am I missing in the code above?

> > Looking at the code in the kernel, we *already* get this right: block if
> > a module is still loading anyway. Once it succeeds we return -EBUSY if
> >
> > it fails we'll proceed to try to load it again.
> >
> > I don't understand what you're trying to fix with adding aliases
> > in-kernel?
>
> Two fold now:
>
> a) validation on request_module() work when an alias is used

But why?

> b) since kmod accepts aliaes, if we get aliases support, it means we could
> *also* preemptively avoid calling out to userspace for modules already
> present.

No, because once we have a module we don't request it: requesting is the
fallback case.

> >> FWIW a few things did occur to me:
> >>
> >> a) list_add_rcu() is used so new modules get added first
> >
> > Only after we're sure that there are no duplicates.
> >
> >
> OK! This is a very critical assertion. I should be able to add a debug
> WARN_ON() should two modules be on the modules list for the same module
> then ?

Yes, names must be unique.

>> b) find_module_all() returns the last module which was added as it
> traverses
>> the module list
>
>> BTW should find_module_all() use rcu to traverse?
>
> Yes; the kallsyms code does this on Oops. Not really a big issue in
> practice, but a nice fix.
>
> Ok, will bundle into my queue.

Please submit to Jessica for her module queue, as it's orthogonal
AFAICT.

> I will note though that I still think there's a bug in this code --
> upon a failure other "spinning" requests can fail, I believe this may
> be due to not having another state or informing pending modules too
> early of a failure but I haven't been able to prove this conjecture
> yet.

That's possible, but I can't see it from quickly re-checking the code.

The module should be fully usable at this point; the module's init has
been called successfully, so in the case of __get_fs_type() it should
now succeed. The module cleans up its init section, but that should be
independent.

If there is a race, it's likely to be when some other caller wakes the
queue. Moving the wakeup as soon as possible should make it easier to
trigger:

diff --git a/kernel/module.c b/kernel/module.c
index f57dd63186e6..78bd89d41a22 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3397,6 +3397,7 @@ static noinline int do_init_module(struct module *mod)

/* Now it's a first class citizen! */
mod->state = MODULE_STATE_LIVE;
+ wake_up_all(&module_wq);
blocking_notifier_call_chain(&module_notify_list,
MODULE_STATE_LIVE, mod);

@@ -3445,7 +3446,6 @@ static noinline int do_init_module(struct module *mod)
*/
call_rcu_sched(&freeinit->rcu, do_free_init);
mutex_unlock(&module_mutex);
- wake_up_all(&module_wq);

return 0;


Thanks,
Rusty.