Re: [RFC][PATCH v3] cgroup: Use CAP_SYS_RESOURCE to allow a process to migrate other tasks between cgroups
From: Michael Kerrisk (man-pages)
Date: Fri Oct 21 2016 - 02:42:36 EST
Hi John,
On 10/21/2016 03:24 AM, John Stultz wrote:
> This patch adds logic to allow a process to migrate other tasks
> between cgroups if it has CAP_SYS_RESOURCE.
This appears to be a patch against your previous patch,
rather than against mainline. Was that intended?
Cheers,
Michael
> In Android (where this feature originated), the ActivityManager tracks
> various application states (TOP_APP, FOREGROUND, BACKGROUND, SYSTEM,
> etc), and then as applications change states, the SchedPolicy logic
> will migrate the application tasks between different cgroups used
> to control the different application states (for example, there is a
> background cpuset cgroup which can limit background tasks to stay
> on one low-power cpu, and the bg_non_interactive cpuctrl cgroup can
> then further limit those background tasks to a small percentage of
> that one cpu's cpu time).
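For readers following along, the migration described above comes down to writing the target's PID into the destination cgroup's cgroup.procs file. A minimal userspace sketch, assuming the caller supplies the full path to that file; migrate_task() is a hypothetical helper name, not something taken from Android or the kernel:

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper: ask the kernel to move `pid` into a cgroup by
 * writing the PID to that cgroup's cgroup.procs file at `procs_path`.
 * Returns 0 on success, -1 on failure.
 */
int migrate_task(const char *procs_path, pid_t pid)
{
    FILE *f = fopen(procs_path, "w");
    if (!f)
        return -1;

    int ok = fprintf(f, "%d\n", (int)pid) > 0;

    /*
     * stdio buffers the write, so a permission failure from the kernel
     * may only surface when the stream is flushed by fclose().
     */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

A caller without permission to move the target would see the write fail, and this sketch reports that as -1.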
>
> However, for security reasons, Android doesn't want to make the
> system_server (the process that runs the ActivityManager and
> SchedPolicy logic) run as root. So in the Android common.git
> kernel, they have some logic to allow cgroups to loosen their
> permissions so CAP_SYS_NICE tasks can migrate other tasks between
> cgroups.
>
> I feel the approach taken there overloads CAP_SYS_NICE a bit much,
> and is maybe more complicated than needed.
>
> So this patch, as suggested by Michael Kerrisk, simply adds a
> check for CAP_SYS_RESOURCE.
>
> I've tested this with AOSP master, and this seems to work well
> as Zygote and system_server already use CAP_SYS_RESOURCE. I've
> also submitted patches against the android-4.4 kernel to change
> it to use CAP_SYS_RESOURCE, and the Android developers have
> seemed ok with this change.
>
> Thoughts and feedback would be appreciated!
>
> Cc: Tejun Heo <tj@xxxxxxxxxx>
> Cc: Li Zefan <lizefan@xxxxxxxxxx>
> Cc: Jonathan Corbet <corbet@xxxxxxx>
> Cc: cgroups@xxxxxxxxxxxxxxx
> Cc: Android Kernel Team <kernel-team@xxxxxxxxxxx>
> Cc: Rom Lemarchand <romlem@xxxxxxxxxxx>
> Cc: Colin Cross <ccross@xxxxxxxxxxx>
> Cc: Dmitry Shmidt <dimitrysh@xxxxxxxxxx>
> Cc: Todd Kjos <tkjos@xxxxxxxxxx>
> Cc: Christian Poetzsch <christian.potzsch@xxxxxxxxxx>
> Cc: Amit Pundir <amit.pundir@xxxxxxxxxx>
> Cc: Dmitry Torokhov <dmitry.torokhov@xxxxxxxxx>
> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
> Cc: Serge E. Hallyn <serge@xxxxxxxxxx>
> Cc: linux-api@xxxxxxxxxxxxxxx
> Signed-off-by: John Stultz <john.stultz@xxxxxxxxxx>
> ---
> v2: Renamed to just CAP_CGROUP_MIGRATE as recommended by Tejun
> v3: Switched to just using CAP_SYS_RESOURCE as suggested by Michael
> ---
> kernel/cgroup.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 09f84d2..866059a 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -2857,7 +2857,7 @@ static int cgroup_procs_write_permission(struct task_struct *task,
>  	if (!uid_eq(cred->euid, GLOBAL_ROOT_UID) &&
>  	    !uid_eq(cred->euid, tcred->uid) &&
>  	    !uid_eq(cred->euid, tcred->suid) &&
> -	    !ns_capable(tcred->user_ns, CAP_CGROUP_MIGRATE))
> +	    !ns_capable(tcred->user_ns, CAP_SYS_RESOURCE))
>  		ret = -EACCES;
>
>  	if (!ret && cgroup_on_dfl(dst_cgrp)) {
>
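To make the effect of the hunk easier to reason about, here is a rough userspace model of the check as it reads after the change. It is only a sketch: struct model_cred and model_procs_write_permission() are made-up names, plain integers stand in for kuid_t, and a boolean stands in for the result of ns_capable(tcred->user_ns, CAP_SYS_RESOURCE) evaluated for the writer.

```c
#include <errno.h>
#include <stdbool.h>

/* Simplified stand-in for the kernel's struct cred (an assumption). */
struct model_cred {
    unsigned int euid;
    unsigned int uid;
    unsigned int suid;
    /* Models ns_capable(tcred->user_ns, CAP_SYS_RESOURCE) for the writer. */
    bool cap_sys_resource;
};

/*
 * Returns 0 if the writer (cred) may migrate the target (tcred),
 * -EACCES otherwise. Mirrors only the uid/capability test in the hunk.
 */
int model_procs_write_permission(const struct model_cred *cred,
                                 const struct model_cred *tcred)
{
    if (cred->euid != 0 &&              /* not GLOBAL_ROOT_UID */
        cred->euid != tcred->uid &&     /* not the target's real uid */
        cred->euid != tcred->suid &&    /* not the target's saved uid */
        !cred->cap_sys_resource)        /* and lacks CAP_SYS_RESOURCE */
        return -EACCES;

    return 0;
}
```

In this model, a non-root writer whose euid matches neither the target's uid nor its suid is refused unless it holds the capability, which is the system_server case described in the commit message.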
--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/