Re: [RESEND][PATCH v4] cgroup: Use CAP_SYS_RESOURCE to allow a process to migrate other tasks between cgroups
From: Andy Lutomirski
Date: Tue Nov 08 2016 - 18:52:15 EST
On Tue, Nov 8, 2016 at 3:28 PM, John Stultz <john.stultz@xxxxxxxxxx> wrote:
> This patch adds logic to allows a process to migrate other tasks
> between cgroups if they have CAP_SYS_RESOURCE.
>
> In Android (where this feature originated), the ActivityManager tracks
> various application states (TOP_APP, FOREGROUND, BACKGROUND, SYSTEM,
> etc), and then as applications change states, the SchedPolicy logic
> will migrate the application tasks between different cgroups used
> to control the different application states (for example, there is a
> background cpuset cgroup which can limit background tasks to stay
> on one low-power cpu, and the bg_non_interactive cpuctrl cgroup can
> then further limit those background tasks to a small percentage of
> that one cpu's cpu time).
>
> However, for security reasons, Android doesn't want to make the
> system_server (the process that runs the ActivityManager and
> SchedPolicy logic), run as root. So in the Android common.git
> kernel, they have some logic to allow cgroups to loosen their
> permissions so CAP_SYS_NICE tasks can migrate other tasks between
> cgroups.
>
> I feel the approach taken there overloads CAP_SYS_NICE a bit much
> for non-android environments.
>
> So this patch, as suggested by Michael Kerrisk, simply adds a
> check for CAP_SYS_RESOURCE.
>
> I've tested this with AOSP master, and this seems to work well
> as Zygote and system_server already use CAP_SYS_RESOURCE. I've
> also submitted patches against the android-4.4 kernel to change
> it to use CAP_SYS_RESOURCE, and the Android developers just merged
> it.
>
I hate to say it, but I think I may see a problem. Current
developments are afoot to make cgroups do more than resource control.
For example, there's Landlock and there's Daniel's ingress/egress
filter thing. Current cgroup controllers can mostly just DoS their
controlled processes. These new controllers (or controller-like
things) can exfiltrate data and change semantics.
Does anyone have a security model in mind for these controllers and
the cgroups that they're attached to? I'm reasonably confident that
CAP_SYS_RESOURCE is not the answer...