Re: LPC 2020 Hackroom Session: summary and next steps for isolated user namespaces

From: Eric W. Biederman
Date: Wed Oct 14 2020 - 15:46:38 EST


"Serge E. Hallyn" <serge@xxxxxxxxxx> writes:

> On Mon, Oct 12, 2020 at 12:01:09AM -0500, Eric W. Biederman wrote:
>> Andy Lutomirski <luto@xxxxxxxxxx> writes:
>>
>> > On Sun, Oct 11, 2020 at 1:53 PM Josh Triplett <josh@xxxxxxxxxxxxxxxx> wrote:
>> >>
>> >> On Fri, Oct 09, 2020 at 11:26:06PM -0500, Serge E. Hallyn wrote:
>> >> > > 3. Find a way to allow setgroups() in a user namespace while keeping
>> >> > > in mind the case of groups used for negative access control.
>> >> > > This was suggested by Josh Triplett and Geoffrey Thomas. Their idea was to
>> >> > > investigate adding a prctl() to allow setgroups() to be called in a user
>> >> > > namespace at the cost of restricting paths to the most restrictive
>> >> > > permission. So if something is 0707 it needs to be treated as if it's 0000
>> >> > > even though the caller is not in its owning group which is used for negative
>> >> > > access control (how these new semantics will interact with ACLs will also
>> >> > > need to be looked into).
>> >> >
>> >> > I should probably think this through more, but for this problem, would it
>> >> > not suffice to add a new prevgroups grouplist to the struct cred, maybe
>> >> > struct group_info *locked_groups, and every time an unprivileged task creates
>> >> > a new user namespace, add all its current groups to this list?
>> >>
>> >> So, effectively, you would be allowed to drop permissions, but
>> >> locked_groups would still be checked for restrictions?
>> >>
>> >> That seems like it'd introduce a new level of complexity (a new facet of
>> >> permission) to manage. Not opposed, but it does seem more complex than
>> >> just opting out of using groups for negative permissions.
>
> Yeah, it would, but I basically hoped that we could catch most of this at
> e.g. generic_permission(), and/or we could introduce a helper which
> automatically adds a check for permission denied from locked_groups, so
> it shouldn't be too wide-spread. If it does end up showing up all over
> the place, then that's a good reason not to do this.
>
>> > Is there any context other than regular UNIX DAC in which groups can
>> > act as negative permissions or is this literally just an issue for
>> > files with a more restrictive group mode than other mode?
>>
>> Just that.
>>
>> The ideas kicked around in the conversation were some variant of having
>> a sysctl that says "This system never uses groups for negative
>> permissions".
>>
>> It was also suggested that if the sysctl was set the the permission
>> checks would be altered such that even if someone tried to set a
>> negative permission, the more liberal permissions of other would be used
>> instead.
>
> So then this would touch all the same code points which the
> locked_groups approach would have to touch?

No locked_groups would touch in_group_p and set_groups. Especially what
set_groups means in that context. It would have to handle what happens
when you start accumulating locked groups (because of multiple
namespaces). How you dedup locked groups etc.

I was not able to convince myself that not being able to clear out
groups that a user has when they create a user namespace won't cause
other problems. Especially as user namespaces had been in use for a
while at that point.

Not supporting negative groups would touch acl_permission and modify it
like:

static int acl_permission_check(struct inode *inode, int mask)
{
[irrelveant code snipped]
/* Only RWX matters for group/other mode bits */
mask &= 7;

/*
* Are the group permissions different from
* the other permissions in the bits we care
* about? Need to check group ownership if so.
*/
if (mask & (mode ^ (mode >> 3))) {
- if (in_group_p(inode->i_gid))
+ if (in_group_p(inode->i_gid) &&
+ (!sysctl_force_positive_groups ||
+ (mask & ~(mode >> 3)))
mode >>= 3;
}

/* Bits in 'mode' clear that we require? */
return (mask & ~mode) ? -EACCES : 0;
}


I don't know that we need to do that. But it would might be a good way
of flushing out the issues.


>> Given that creating /etc/subgid is effectively opting out of negative
>> permissions already have a sysctl that says that upfront feels like a
>> very clean solution.
>>
>> Eric
>
> That feels like a cop-out to me. If some young admin at Roxxon Corp decides
> she needs to run a container, so installs subuid package and sets that sysctl,
> how does she know whether or not some previous admin, who has since retired and
> did not keep good docs, set things up so that a negative acl is keeping nginx
> from reading some supersecret doc?
>
> Now personally I'm not a great believer in the negative acls so I think the
> above is a very unlikely scenario, but if we're going to worry about it, then
> we should worry about it :)

There is a different between guaranting we don't break existing setups
when a new feature is enabled, and supporting old very rare setups when
a new feature is enabled.


> "Click this button if noone has ever used feature X on this server"

My current thinking is that if we already don't honor negative groups
when /etc/subgid exists it would not hurt to make that more explicit.


>From what we could tell at the time people that know negative groups are
honored much less systems that actually use negative groups are
exceedingly rare.

Eric