Re: [RESEND][PATCH v4] cgroup: Use CAP_SYS_RESOURCE to allow a process to migrate other tasks between cgroups

From: Alexei Starovoitov
Date: Tue Nov 08 2016 - 19:04:41 EST


On Tue, Nov 08, 2016 at 03:51:40PM -0800, Andy Lutomirski wrote:
> On Tue, Nov 8, 2016 at 3:28 PM, John Stultz <john.stultz@xxxxxxxxxx> wrote:
> > This patch adds logic to allows a process to migrate other tasks
> > between cgroups if they have CAP_SYS_RESOURCE.
> >
> > In Android (where this feature originated), the ActivityManager tracks
> > various application states (TOP_APP, FOREGROUND, BACKGROUND, SYSTEM,
> > etc), and then as applications change states, the SchedPolicy logic
> > will migrate the application tasks between different cgroups used
> > to control the different application states (for example, there is a
> > background cpuset cgroup which can limit background tasks to stay
> > on one low-power cpu, and the bg_non_interactive cpuctrl cgroup can
> > then further limit those background tasks to a small percentage of
> > that one cpu's cpu time).
> >
> > However, for security reasons, Android doesn't want to make the
> > system_server (the process that runs the ActivityManager and
> > SchedPolicy logic), run as root. So in the Android common.git
> > kernel, they have some logic to allow cgroups to loosen their
> > permissions so CAP_SYS_NICE tasks can migrate other tasks between
> > cgroups.
> >
> > I feel the approach taken there overloads CAP_SYS_NICE a bit much
> > for non-android environments.
> >
> > So this patch, as suggested by Michael Kerrisk, simply adds a
> > check for CAP_SYS_RESOURCE.
> >
> > I've tested this with AOSP master, and this seems to work well
> > as Zygote and system_server already use CAP_SYS_RESOURCE. I've
> > also submitted patches against the android-4.4 kernel to change
> > it to use CAP_SYS_RESOURCE, and the Android developers just merged
> > it.
> >
>
> I hate to say it, but I think I may see a problem. Current
> developments are afoot to make cgroups do more than resource control.
> For example, there's Landlock and there's Daniel's ingress/egress
> filter thing. Current cgroup controllers can mostly just DoS their
> controlled processes. These new controllers (or controller-like
> things) can exfiltrate data and change semantics.
>
> Does anyone have a security model in mind for these controllers and
> the cgroups that they're attached to? I'm reasonably confident that
> CAP_SYS_RESOURCE is not the answer...

and specifically the answer is... ?
Also would be great if you start with specifying the question first
and the problem you're trying to solve.