Re: [PATCH] kernel: reduce required permission for prctl_set_mm

From: Eric W. Biederman
Date: Wed Feb 12 2014 - 18:14:28 EST


Andrey Vagin <avagin@xxxxxxxxxx> writes:

> Currently prctl_set_mm requires the global CAP_SYS_RESOURCE,
> this patch reduce requiremence to CAP_SYS_RESOURCE in the current
> namespace.
>
> When we restore a task we need to set up text, data and data heap sizes
> from userspace to the values a task had at checkpoint time.
>
> Currently we can not restore these parameters, if a task lives in
> a non-root user name space, because it has no capabilities in the
> parent namespace.
>
> prctl_set_mm() changes parameters of the current task and doesn't affect
> other tasks.
>
> This patch affects the RLIMIT_DATA limit, because a consumtiuon is
> calculated relatively to mm->end_data, mm->start_data, mm->start_brk.
>
> rlim = rlimit(RLIMIT_DATA);
> if (rlim < RLIM_INFINITY && (brk - mm->start_brk) +
> (mm->end_data - mm->start_data) > rlim)
> goto out;
>
> This limit affects calls to brk() and sbrk(), but it doesn't affect
> mmap. So I think requirement of CAP_SYS_RESOURCE in the current
> namespace is enough for this limit.

Ick. No.

You do not have an argument for reducing the capable call here to
ns_capable. ns_capable(current_user_ns(), CAP_SYS_RESOURCE) does not
currently allow anything. If ns_capable(current_user_ns(),
CAP_SYS_RESOURCE) were to allow things there would still need to be a
check for a root setable maximum which is not present in this patch.

Either you have an argument for completely removing the capability check
or your reasoning is broken.

Reading through the code and reading through brk I an fairly confident
that your reasoning is broken.

The rlimit test needs to be when any of start_brk, end_data, or
start_data are changed, and that test is most definitely not performed.

Checks for enforcing the stack_size are completely missing.

It does look like with care we can remove or make much more precise the
capable checks from in prctl_set_mm but this patch definitely does not
take that needed care.

Nacked-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>

> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: Oleg Nesterov <oleg@xxxxxxxxxx>
> Cc: Robin Holt <holt@xxxxxxx>
> Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
> Cc: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>
> Cc: Chen Gang <gang.chen@xxxxxxxxxxx>
> Cc: Stephen Rothwell <sfr@xxxxxxxxxxxxxxxx>
> Cc: Pavel Emelyanov <xemul@xxxxxxxxxxxxx>
> Cc: Aditya Kali <adityakali@xxxxxxxxxx>
> Cc: security@xxxxxxxxxx
> Signed-off-by: Andrey Vagin <avagin@xxxxxxxxxx>
> ---
> kernel/sys.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sys.c b/kernel/sys.c
> index c0a58be..6f36fb3 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -1701,7 +1701,7 @@ static int prctl_set_mm(int opt, unsigned long addr,
> if (arg5 || (arg4 && opt != PR_SET_MM_AUXV))
> return -EINVAL;
>
> - if (!capable(CAP_SYS_RESOURCE))
> + if (!ns_capable(current_user_ns(), CAP_SYS_RESOURCE))
> return -EPERM;
>
> if (opt == PR_SET_MM_EXE_FILE)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/