Re: [PATCH] userns: move user access out of the mutex
From: Jann Horn
Date: Tue Jun 26 2018 - 10:07:02 EST
On Tue, Jun 26, 2018 at 3:08 PM Christian Brauner
<christian.brauner@xxxxxxxxxxxxx> wrote:
>
> On Mon, Jun 25, 2018 at 06:34:19PM +0200, Jann Horn wrote:
> > The old code would hold the userns_state_mutex indefinitely if
> > memdup_user_nul stalled due to e.g. a userfault region. Prevent that by
> > moving the memdup_user_nul in front of the mutex_lock().
> >
> > Note: This changes the error precedence of invalid buf/count/*ppos vs
> > map already written / capabilities missing.
> >
> > Fixes: 22d917d80e84 ("userns: Rework the user_namespace adding uid/gid...")
> > Cc: stable@xxxxxxxxxxxxxxx
> > Signed-off-by: Jann Horn <jannh@xxxxxxxxxx>
> > ---
> > kernel/user_namespace.c | 24 ++++++++++--------------
> > 1 file changed, 10 insertions(+), 14 deletions(-)
> >
> > diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
> > index c3d7583fcd21..e5222b5fb4fe 100644
> > --- a/kernel/user_namespace.c
> > +++ b/kernel/user_namespace.c
> > @@ -859,7 +859,16 @@ static ssize_t map_write(struct file *file, const char __user *buf,
> > unsigned idx;
> > struct uid_gid_extent extent;
> > char *kbuf = NULL, *pos, *next_line;
> > - ssize_t ret = -EINVAL;
> > + ssize_t ret;
> > +
> > + /* Only allow < page size writes at the beginning of the file */
> > + if ((*ppos != 0) || (count >= PAGE_SIZE))
> > + return -EINVAL;
> > +
> > + /* Slurp in the user data */
> > + kbuf = memdup_user_nul(buf, count);
> > + if (IS_ERR(kbuf))
> > + return PTR_ERR(kbuf);
>
> I'm not opposed to this but what is annoying is the changed error
> reporting you pointed out. It seems conceptually way cleaner if missing
> permissions are reported before more specific internal errors.
>
> The question I have is how bad it would be if memdup_user_nul() stalled
> and if you see any issues security wise. Given that the mutex is only
> taken on operations that are mostly performed when creating or setting
> up a new user namespace
>
> map_write()
> create_user_ns()
> proc_setgroups_write()
> userns_may_setgroups()
>
> and not when actually interacting with it it seems the worst that
> happens is that creation of new user namespaces is not possible anymore.
> That shouldn't have any effect on the host though which I would see as a
> real issue. But I might be missing context. :)
userns_may_setgroups() is involved in sys_setgroups() via
may_setgroups(), so if one thread is blocking the userns_state_mutex,
nobody can log in anymore.
> > /*
> > * The userns_state_mutex serializes all writes to any given map.
> > @@ -895,19 +904,6 @@ static ssize_t map_write(struct file *file, const char __user *buf,
> > if (cap_valid(cap_setid) && !file_ns_capable(file, ns, CAP_SYS_ADMIN))
> > goto out;
> >
> > - /* Only allow < page size writes at the beginning of the file */
> > - ret = -EINVAL;
> > - if ((*ppos != 0) || (count >= PAGE_SIZE))
> > - goto out;
> > -
> > - /* Slurp in the user data */
> > - kbuf = memdup_user_nul(buf, count);
> > - if (IS_ERR(kbuf)) {
> > - ret = PTR_ERR(kbuf);
> > - kbuf = NULL;
> > - goto out;
> > - }
> > -
> > /* Parse the user data */
> > ret = -EINVAL;
> > pos = kbuf;
> > --
> > 2.18.0.rc2.346.g013aa6912e-goog
> >