Re: [RFC] Expose request_module via syscall

From: Thomas Weißschuh
Date: Sun Sep 19 2021 - 03:56:37 EST


On 2021-09-18T11:47-0700, Andy Lutomirski wrote:
> On Thu, Sep 16, 2021, at 2:27 AM, Christian Brauner wrote:
> > On Wed, Sep 15, 2021 at 09:47:25AM -0700, Andy Lutomirski wrote:
> > > On Wed, Sep 15, 2021 at 8:50 AM Thomas Weißschuh <thomas@xxxxxxxx> wrote:
> > > >
> > > > Hi,
> > > >
> > > > I would like to propose a new syscall that exposes the functionality of
> > > > request_module() to userspace.
> > > >
> > > > Proposed signature: request_module(char *module_name, char **args, int flags);
> > > > Where args and flags have to be NULL and 0 for the time being.
> > > >
> > > > Rationale:
> > > >
> > > > We are using nested, privileged containers which are loading kernel modules.
> > > > Currently we have to always pass around the contents of /lib/modules from the
> > > > root namespace which contains the modules.
> > > > (Also the containers need to have userspace components for module loading
> > > > installed)
> > > >
> > > > The syscall would remove the need for this bookkeeping work.
> > >
> > > I feel like I'm missing something, and I don't understand the purpose
> > > of this syscall. Wouldn't the right solution be for the container to
> > > have a stub module loader (maybe doable with a special /sbin/modprobe
> > > or maybe a kernel patch would be needed, depending on the exact use
> > > case) and have the stub call out to the container manager to request
> > > the module? The container manager would check its security policy and
> > > load the module or not load it as appropriate.
> >
> > I don't see the need for a syscall like this yet either.
> >
> > This should be the job of the container manager. modprobe just calls the
> > init_module() syscall, right?
>
> Not quite so simple. modprobe parses things in /lib/modules and maybe /etc to decide what init_module() calls to do.
>
> But I admit I’m a bit confused. What exactly is the container doing that causes the container’s copy of modprobe to be called?

The container is running an instance of the docker daemon in swarm mode.
That needs the "ip_vs" module (amongst others) and explicitly tries to load it
via modprobe.
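
To make the /lib/modules dependency concrete, here is a rough sketch of the
kind of lookup modprobe has to do before it can load "ip_vs". It is not
modprobe's actual code: it only parses the modules.dep text format
("path: dependency paths") and lists the files a container would need to
have available. The sample data and the helper names (module_name,
parse_modules_dep, files_needed) are assumptions made for illustration.

```python
import os

# Illustrative excerpt in modules.dep format (not taken from a real system).
# Real files list paths relative to /lib/modules/<kernel release>/ and
# differ per kernel build.
SAMPLE_MODULES_DEP = """\
kernel/net/netfilter/ipvs/ip_vs.ko: kernel/net/netfilter/nf_conntrack.ko kernel/lib/libcrc32c.ko
kernel/net/netfilter/nf_conntrack.ko:
kernel/lib/libcrc32c.ko:
"""


def module_name(path):
    """Derive a module name from a .ko path (hypothetical helper).

    Drops the directory, the ".ko" suffix and anything after it (such as a
    compression extension), and treats '-' and '_' as equivalent.
    """
    base = os.path.basename(path)
    stem = base.split(".ko", 1)[0]
    return stem.replace("-", "_")


def parse_modules_dep(text):
    """Map module name -> (module path, list of dependency paths)."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        target, _, deps = line.partition(":")
        target = target.strip()
        table[module_name(target)] = (target, deps.split())
    return table


def files_needed(table, name):
    """Return the module file plus every dependency file listed for it."""
    path, deps = table[name.replace("-", "_")]
    return [path] + deps


if __name__ == "__main__":
    table = parse_modules_dep(SAMPLE_MODULES_DEP)
    for path in files_needed(table, "ip_vs"):
        print(path)
```

Every path this returns has to exist inside the container today, which is
the bookkeeping described in the original proposal. The order in which the
files are actually loaded is left to modprobe and is not modelled here.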

> > If so the seccomp notifier can be used to intercept this system call for
> > the container and verify the module against an allowlist similar to how
> > we currently handle mount.
> >
> > Christian
> >
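
For reference, the allowlist check suggested above could look roughly like
the sketch below. It covers only the policy decision a container manager
might make once it has intercepted the call and worked out which module is
being requested; the seccomp notifier plumbing and the step of deriving the
module name from the intercepted call are left out. The allowlist contents
and the function names are hypothetical.

```python
import errno

# Hypothetical policy: modules this container is permitted to load.
ALLOWED_MODULES = frozenset({"ip_vs", "nf_conntrack"})


def normalize(name):
    """Treat '-' and '_' as equivalent, as module tooling does."""
    return name.strip().replace("-", "_")


def decide(name, allowlist=ALLOWED_MODULES):
    """Return 0 to allow the load, or errno.EPERM to reject it."""
    return 0 if normalize(name) in allowlist else errno.EPERM


if __name__ == "__main__":
    for candidate in ("ip-vs", "some_other_module"):
        verdict = "allow" if decide(candidate) == 0 else "deny"
        print(candidate, verdict)
```

The manager would report the returned value back as the result of the
intercepted call, or perform the load on the container's behalf when the
result is 0.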