Re: [PATCH] kmod: make request_module() return an error when autoloading is disabled

From: Luis Chamberlain
Date: Wed Mar 11 2020 - 02:31:43 EST


On Tue, Mar 10, 2020 at 10:26:20PM -0700, Eric Biggers wrote:
> On Wed, Mar 11, 2020 at 04:32:21AM +0000, Luis Chamberlain wrote:
> > On Tue, Mar 10, 2020 at 03:37:31PM -0700, Eric Biggers wrote:
> > > From: Eric Biggers <ebiggers@xxxxxxxxxx>
> > >
> > > It's long been possible to disable kernel module autoloading completely
> > > by setting /proc/sys/kernel/modprobe to the empty string. This can be
> > > preferable
> >
> > preferable but ... not documented. Or was this documented or recommended
> > somewhere?
> >
> > > to setting it to a nonexistent file since it avoids the
> > > overhead of an attempted execve(), avoids potential deadlocks, and
> > > avoids the call to security_kernel_module_request() and thus on
> > > SELinux-based systems eliminates the need to write SELinux rules to
> > > dontaudit module_request.
>
> Not that I know of, though I didn't look too hard. proc(5) mentions
> /proc/sys/kernel/modprobe but doesn't mention the empty string case.
>
> In any case, it's been supported for a long time, and it's useful for the
> reasons I mentioned.

Sure. I think it's important to document it as such then, or perhaps
add a kconfig option which sets this to empty and document it in the
kconfig entry.

> > > However, when module autoloading is disabled in this way,
> > > request_module() returns 0. This is broken because callers expect 0 to
> > > mean that the module was successfully loaded.
> >
> > However this is implicitly not true. For instance, as Neil recently
> > chased down -- blacklisting a module today returns 0 as well, and so
> > this corner case is implicitly set to return 0.
>
> That sounds like another similar bug, but in the modprobe program instead of in
> the kernel. Do you have a link to the discussion about it?

Nothing public yet AFAICT.

> > > But
> > > improperly returning 0 can indeed confuse a few callers, for example
> > > get_fs_type() in fs/filesystems.c where it causes a WARNING to be hit:
> > >
> > > 	if (!fs && (request_module("fs-%.*s", len, name) == 0)) {
> > > 		fs = __get_fs_type(name, len);
> > > 		WARN_ONCE(!fs, "request_module fs-%.*s succeeded, but still no fs?\n", len, name);
> > > 	}
> > >
> > > This is easily reproduced with:
> > >
> > > echo > /proc/sys/kernel/modprobe
> > > mount -t NONEXISTENT none /
> > >
> > > It causes:
> > >
> > > request_module fs-NONEXISTENT succeeded, but still no fs?
> > > WARNING: CPU: 1 PID: 1106 at fs/filesystems.c:275 get_fs_type+0xd6/0xf0
> > > [...]
> >
> > Thanks for reporting this.
> >
> > > Arguably this warning is broken and should be removed, since the module
> > > could have been unloaded already.
> >
> > No, the warning is present *because* issues where the module which
> > did not load is a rootfs are *really* hard to debug. Then,
> > if the culprit of the issue is a userspace modprobe bug (it happens)
> > this makes debugging *very* difficult as you won't know what failed at
> > all, you just get a silent failed boot.
>
> I meant that it's broken to use WARN_ON(), because it's a userspace triggerable
> condition.

This and the blacklist case are now two known cases, so yes, I agree
now. It was not widely known before.

> WARN_ON() is for kernel bugs only. Of course, if it's a useful
> warning, it can still be left in as pr_warn().

I'll send a patch.

> > > However, request_module() should also
> > > correctly return an error when it fails. So let's make it return
> > > -ENOENT, which matches the error when the modprobe binary doesn't exist.
> >
> > This is a user experience change though, and I don't have on my radar
> > who would use this and expect the old behaviour. Josh, would you by
> > chance?
> >
> > I'd like this to be more an RFC first so we get vetted parties to
> > review. I take it this and Neil's case are cases we should revisit now,
> > properly document as we didn't before, ensure we don't break anything,
> > and also extend the respective kmod selftests to ensure we don't break
> > these corner cases in the future.
>
> This patch only affects kernel internals, not the userspace API.

Ah yes, in that case this seems fine with me.
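
For anyone following along, the change in kernel-internal behaviour can
be mocked up in userspace roughly like this. This is only a sketch, not
the patch itself: every mock_* name is made up for illustration, and the
real code path does far more than this early exit.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for the buffer behind /proc/sys/kernel/modprobe. */
char mock_modprobe_path[256] = "/sbin/modprobe";

/* Hypothetical helper: mimics `echo > /proc/sys/kernel/modprobe`. */
void mock_set_modprobe_path(const char *path)
{
	assert(path != NULL);
	strncpy(mock_modprobe_path, path, sizeof(mock_modprobe_path) - 1);
	mock_modprobe_path[sizeof(mock_modprobe_path) - 1] = '\0';
}

/*
 * Mock of the early exit under discussion. With an empty path,
 * autoloading is disabled, so report an error instead of 0.
 * A return of 0 here only means "the helper would have been run".
 */
int mock_request_module(const char *name)
{
	assert(name != NULL);
	if (mock_modprobe_path[0] == '\0')
		return -ENOENT;	/* the old behaviour was: return 0 */
	return 0;
}
```

With the default path the mock returns 0; after
mock_set_modprobe_path("") it returns -ENOENT, so a caller no longer
mistakes "autoloading disabled" for "module loaded".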

> So I don't see
> why it would be controversial? I already went through all callers of
> request_module() that check its return value, and they all appear to work better
> with -ENOENT, since they assume that 0 means the module was loaded.

Thanks for doing that, but I note that getting 0 is no assurance
either. The de-facto best practice for the request_module() call is to
follow it with your own in-place verifier.
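
A rough userspace mock-up of that caller-side pattern, modelled loosely
on the get_fs_type() snippet quoted above. It is a sketch only: the
mock_* names and the tiny registry are invented for this example, and
the no-op loader stands in for any case (blacklist, broken modprobe)
where 0 comes back but nothing got loaded.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical registry entry, standing in for a registered filesystem. */
struct mock_fs_type {
	const char *name;
};

static const struct mock_fs_type mock_registered[] = {
	{ "ext4" },
	{ "tmpfs" },
};

/* Look the name up among what is actually registered right now. */
const struct mock_fs_type *mock_find_fs(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(mock_registered) / sizeof(mock_registered[0]); i++)
		if (strcmp(mock_registered[i].name, name) == 0)
			return &mock_registered[i];
	return NULL;
}

/* Mock loader: claims success (0) but never registers anything. */
int mock_request_module_noop(const char *name)
{
	(void)name;
	return 0;
}

/* Caller with its own verifier: never trust 0, always look again. */
const struct mock_fs_type *mock_get_fs(const char *name)
{
	const struct mock_fs_type *fs = mock_find_fs(name);

	if (!fs && mock_request_module_noop(name) == 0)
		fs = mock_find_fs(name);	/* the in-place verifier */
	return fs;	/* NULL means the request did not help */
}
```

The caller ends up with NULL for an unknown name even though the loader
said 0, which is exactly why the second lookup cannot be dropped today.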

> Incorrectly returning 0 typically causes unnecessary work (checking again
> whether the module's functionality is available) or misleading log messages.

Yes, but returning 0 cannot be relied upon today for assuming the module
is loaded. *If* we revisit that decision and want the kernel to do a
generic verifier, then yes, we can get rid of all the caller-specific
verifiers, but not today.

> In
> fact, I can't think of a situation where kernel code would *want* 0 returned in
> this case, as it's ambiguous with the module being successfully loaded.

Unfortunately that's just how the (to my mind silly) API grew.

> Sure, I'll check whether it would be possible to add a test for this case in
> lib/test_kmod.c and tools/testing/selftests/kmod/.

Thanks!

Luis