Re: [RFC] capabilities: add capability cgroup controller
From: Topi Miettinen
Date: Tue Jun 21 2016 - 17:36:29 EST
On 06/21/16 15:45, Serge E. Hallyn wrote:
> Quoting Topi Miettinen (toiwoton@xxxxxxxxx):
>> On 06/19/16 20:01, serge@xxxxxxxxxx wrote:
>>> (apologies for top posting, this phone doesn't support inline)
>>>
>>> Where are you preventing less privileged tasks from limiting the caps of a more privileged task? It looks like you are relying on the cgroupfs for that?
>>
>> I didn't think about that aspect. Some of that could be dealt with by
>> preventing tasks which don't have CAP_SETPCAP from making other tasks
>> join the cgroup or from setting the bounding set. One problem is that
>> the privileges would not be checked at cgroup.procs open(2) time but
>> only when writing. In general, less privileged tasks should not be able
>> to gain new capabilities even if they were somehow able to join the
>> cgroup, and your case must also be addressed in full.
>>
>>>
>>> Overall I'm not a fan of this for several reasons. Can you tell us precisely what your use case is?
>>
>> There are two.
>>
>> 1. Capability use tracking at cgroup level. There is no way to know
>> which capabilities have been used and which could be trimmed. With the
>> cgroup approach, we can also keep track of how subprocesses use
>> capabilities. Thus the administrator can quickly get a reasonable
>> estimate of a bounding set just by reading the capability.used file.
>
> So to estimate the privileges needed by an application? Note this
> could also be done with something like systemtap, but that's not as
> friendly of course.
>
I've used systemtap to track how a single process uses capabilities, but
I can imagine that without the cgroup, using it to track several
subprocesses could be difficult.
> Keeping the tracking part separate from enforcement might be worthwhile.
> If you wanted to push that part of the patchset, we could keep
> discussing the enforcement aspect separately.
>
OK, I'll prepare the tracking part first.
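
To illustrate how the tracking output might be consumed, here is a minimal sketch in C. It assumes, purely for illustration, that capability.used would contain a single hexadecimal bitmask in the same style as the CapBnd line of /proc/<pid>/status; this message does not specify the file format. The helper names parse_cap_mask() and cap_mask_has() are hypothetical, and the EX_CAP_* constants simply mirror the capability numbers from <linux/capability.h>.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Capability numbers as defined in <linux/capability.h>. */
enum {
    EX_CAP_SETUID = 7,
    EX_CAP_SETPCAP = 8,
    EX_CAP_NET_BIND_SERVICE = 10
};

/*
 * Parse a hexadecimal capability mask (assumed file format, see above).
 * A single trailing newline is accepted. Returns 0 on success and -1 on
 * malformed input.
 */
int parse_cap_mask(const char *text, uint64_t *mask)
{
    char *end;
    unsigned long long value;

    errno = 0;
    value = strtoull(text, &end, 16);
    if (errno != 0 || end == text)
        return -1;
    if (*end == '\n')
        end++;
    if (*end != '\0')
        return -1;

    *mask = (uint64_t)value;
    return 0;
}

/* Return nonzero if capability number cap is set in mask. */
int cap_mask_has(uint64_t mask, int cap)
{
    return cap >= 0 && cap < 64 && ((mask >> cap) & 1) != 0;
}
```

An administrator's tool could read the file, call parse_cap_mask() on its contents, and then test each capability of interest with cap_mask_has() to arrive at a candidate bounding set.
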
>> 2. cgroup approach to capability management. Currently the capabilities
>> are inherited, with the bounding set and ambient capabilities playing
>> their part. With cgroups, additional limits can be set which apply to the
>> whole group. I admit that the difference from the current model is small.
>>
>> Could you list the several reasons you mentioned?
>
> Should have done it Sunday while my mind was clear on it.
>
> The first is that while we normally think of preventing a less
> privileged task from becoming more privileged, it can be just as
> dangerous to allow a less privileged task to rob a more privileged
> task of some capability. See in particular the sendmail capability
> story. By allowing an unprivileged task to run a setuid-root task in an
> unexpected configuration - namely, denying it the ability to setuid(),
> it was possible to get a root owned task doing your bidding.
>
> So that's why I'm particularly concerned about allowing cgroupfs dac
> permissions to dictate who gets to say what privileges other tasks on
> the system can get.
>
It could be especially tricky if the privileges are suddenly lost while
the process is already executing.
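
As a concrete illustration of why a silently missing capability is dangerous: a privileged program that drops privileges has to confirm that the drop really happened, since setuid() can fail when the capability it needs has been taken away. The following is only a sketch of that defensive check; drop_to_uid() is a hypothetical helper name and is not part of the patch set.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Drop to target_uid and verify that the change took effect.
 * Returns 0 on success, -1 if setuid() failed or the IDs did not change.
 * A caller that gets -1 should stop rather than continue with the old
 * privileges.
 */
int drop_to_uid(uid_t target_uid)
{
    if (setuid(target_uid) != 0)
        return -1;  /* e.g. EPERM because the needed capability is missing */

    /* Double-check both the real and the effective UID. */
    if (getuid() != target_uid || geteuid() != target_uid)
        return -1;

    return 0;
}
```

A program that ignores the return value keeps running with its original privileges, which is the kind of unexpected configuration described above.
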
> Another reason is simply that the capability calculation scheme is
> for historical reasons already quite complicated. So if there is
> something worthwhile to add we can discuss, but it'll take a compelling
> otherwise-unsolvable use case to convince me we should complicate it
> further.
>
> In general, capabilities can be very cleanly predicted by looking at
> the parent task and the file being executed. Adding a cgroup into
> the mix allows basically any random task to sneak in, change the
> setting, and make a process unexpectedly not get a privilege on a
> new execve when it did get it on the previous execve.
>
> As amorgan will point out, posix caps are meant to be purely orthogonal
> to dac. We have hooks in place to make setuid work, but those can be
> shut off to get a system where uid root is no one special (other than
> owning system files). So again, allowing a root user through cgroupfs
> access to change the bounding set for other tasks flies in the face of
> that. (we're already smudging that picture with the user-namespaced
> filecaps, though trying not to)
>
> -serge
>
Right. I'm almost convinced that the capability management part doesn't
make much sense.
-Topi