Re: [RFC][PATCH v3] cgroup: Use CAP_SYS_RESOURCE to allow a process to migrate other tasks between cgroups
From: Serge E. Hallyn
Date: Fri Oct 21 2016 - 09:42:05 EST
Quoting John Stultz (john.stultz@xxxxxxxxxx):
> This patch adds logic to allow a process to migrate other tasks
> between cgroups if they have CAP_SYS_RESOURCE.
Hi,
fwiw this seems to me a reasonable choice, that is, giving a task
CAP_SYS_RESOURCE to allow it to write to the procs file doesn't
seem like giving it too much. +1 from me. So long as the Android
folks don't have a reason why this won't work for them,
Acked-by: Serge Hallyn <serge@xxxxxxxxxx>
-serge
> In Android (where this feature originated), the ActivityManager tracks
> various application states (TOP_APP, FOREGROUND, BACKGROUND, SYSTEM,
> etc), and then as applications change states, the SchedPolicy logic
> will migrate the application tasks between different cgroups used
> to control the different application states (for example, there is a
> background cpuset cgroup which can limit background tasks to stay
> on one low-power cpu, and the bg_non_interactive cpuctrl cgroup can
> then further limit those background tasks to a small percentage of
> that one cpu's cpu time).
>
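For readers unfamiliar with the mechanism described above: migrating a task comes down to writing its PID into the destination cgroup's procs file. Below is a minimal userspace sketch, not Android's actual SchedPolicy code. The helper name migrate_task() and the path passed to it are hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: move the process `pid` into the cgroup whose
 * procs file is at `procs_path` by writing the PID as decimal text.
 * Returns 0 on success or a negative errno value on failure.
 */
int migrate_task(const char *procs_path, pid_t pid)
{
	char buf[32];
	ssize_t n;
	int len, fd, err = 0;

	len = snprintf(buf, sizeof(buf), "%ld\n", (long)pid);
	if (len < 0 || (size_t)len >= sizeof(buf))
		return -EINVAL;

	fd = open(procs_path, O_WRONLY);
	if (fd < 0)
		return -errno;

	/* On a real cgroup file, the kernel's permission check runs here. */
	n = write(fd, buf, (size_t)len);
	if (n < 0)
		err = -errno;
	else if (n != len)
		err = -EIO;

	close(fd);
	return err;
}
```

When the writer is neither root nor the owner of the target task and lacks the required capability, the write is where the caller would see -EACCES.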
> However, for security reasons, Android doesn't want to make the
> system_server (the process that runs the ActivityManager and
> SchedPolicy logic), run as root. So in the Android common.git
> kernel, they have some logic to allow cgroups to loosen their
> permissions so CAP_SYS_NICE tasks can migrate other tasks between
> cgroups.
>
> I feel the approach taken there overloads CAP_SYS_NICE a bit much,
> and is maybe more complicated than needed.
>
> So this patch, as suggested by Michael Kerrisk, simply adds a
> check for CAP_SYS_RESOURCE.
>
> I've tested this with AOSP master, and this seems to work well
> as Zygote and system_server already use CAP_SYS_RESOURCE. I've
> also submitted patches against the android-4.4 kernel to change
> it to use CAP_SYS_RESOURCE, and the Android developers have
> seemed ok with this change.
>
> Thoughts and feedback would be appreciated!
>
> Cc: Tejun Heo <tj@xxxxxxxxxx>
> Cc: Li Zefan <lizefan@xxxxxxxxxx>
> Cc: Jonathan Corbet <corbet@xxxxxxx>
> Cc: cgroups@xxxxxxxxxxxxxxx
> Cc: Android Kernel Team <kernel-team@xxxxxxxxxxx>
> Cc: Rom Lemarchand <romlem@xxxxxxxxxxx>
> Cc: Colin Cross <ccross@xxxxxxxxxxx>
> Cc: Dmitry Shmidt <dimitrysh@xxxxxxxxxx>
> Cc: Todd Kjos <tkjos@xxxxxxxxxx>
> Cc: Christian Poetzsch <christian.potzsch@xxxxxxxxxx>
> Cc: Amit Pundir <amit.pundir@xxxxxxxxxx>
> Cc: Dmitry Torokhov <dmitry.torokhov@xxxxxxxxx>
> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
> Cc: Serge E. Hallyn <serge@xxxxxxxxxx>
> Cc: linux-api@xxxxxxxxxxxxxxx
> Signed-off-by: John Stultz <john.stultz@xxxxxxxxxx>
> ---
> v2: Renamed to just CAP_CGROUP_MIGRATE as recommended by Tejun
> v3: Switched to just using CAP_SYS_RESOURCE as suggested by Michael
> ---
> kernel/cgroup.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 09f84d2..866059a 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -2857,7 +2857,7 @@ static int cgroup_procs_write_permission(struct task_struct *task,
>  	if (!uid_eq(cred->euid, GLOBAL_ROOT_UID) &&
>  	    !uid_eq(cred->euid, tcred->uid) &&
>  	    !uid_eq(cred->euid, tcred->suid) &&
> -	    !ns_capable(tcred->user_ns, CAP_CGROUP_MIGRATE))
> +	    !ns_capable(tcred->user_ns, CAP_SYS_RESOURCE))
>  		ret = -EACCES;
>  
>  	if (!ret && cgroup_on_dfl(dst_cgrp)) {
> --
> 2.7.4
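
To make the effect of the one-line change easier to reason about, here is a rough userspace model of the check in the hunk above. It is only a sketch: struct model_cred and model_procs_write_permission() are made-up names, the cap_sys_resource flag stands in for ns_capable(tcred->user_ns, CAP_SYS_RESOURCE) being true for the writer, and the extra default-hierarchy checks that follow in the kernel function are left out.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the fields of struct cred used here. */
struct model_cred {
	unsigned int uid;	/* real UID */
	unsigned int euid;	/* effective UID */
	unsigned int suid;	/* saved UID */
	bool cap_sys_resource;	/* models the ns_capable() result for this task */
};

#define MODEL_ROOT_UID 0u

/*
 * Models the condition in the patched hunk: the writer (cred) may migrate
 * the target (tcred) if it is root, if its euid matches the target's uid
 * or suid, or if it holds CAP_SYS_RESOURCE. Returns 0 or -EACCES.
 */
int model_procs_write_permission(const struct model_cred *cred,
				 const struct model_cred *tcred)
{
	if (cred->euid != MODEL_ROOT_UID &&
	    cred->euid != tcred->uid &&
	    cred->euid != tcred->suid &&
	    !cred->cap_sys_resource)
		return -EACCES;

	return 0;
}
```

In this model, a non-root writer with a different uid is refused unless the capability flag is set, which matches the system_server case described in the changelog.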