Re: LPC 2020 Hackroom Session: summary and next steps for isolated user namespaces

From: Giuseppe Scrivano
Date: Thu Apr 22 2021 - 05:18:14 EST


Snaipe <snaipe@xxxxxxxxxx> writes:

> "Giuseppe Scrivano" <gscrivan@xxxxxxxxxx> writes:
>>>> >> instead of a prctl, I've added a new mode to /proc/PID/setgroups that
>>>> >> allows setgroups in a userns locking the current gids.
>>>> >>
>>>> >> What do you think about using /proc/PID/setgroups instead of a new
>>>> >> prctl()?
>>>> >
>>>> > It's better than not having it, but two concerns -
>>>> >
>>>> > 1. some userspace, especially testsuites, could become confused by the fact
>>>> > that they can't drop groups no matter how hard they try, since these will all
>>>> > still show up as regular groups.
>>>>
>>>> I forgot to send a link to a second patch :-) that completes the feature:
>>>> https://github.com/giuseppe/linux/commit/1c5fe726346b216293a527719e64f34e6297f0c2
>>>>
>>>> When the new mode is used, the gids that are not known in the userns do
>>>> not show up in userspace.
>>>
>>> Ah, right - and of course those gids better not be mapped into the namespace :)
>>>
>>> But so, this is the patch you said you agreed was not worth the extra
>>> complexity?
>>
>> Yes, these two patches are what looked too complex at that time. The
>> problem still exists though; we could perhaps reconsider whether the
>> extra complexity is acceptable to address it.
>
> Hey Folks, sorry for necro-bumping, but I've found this discussion
> while searching for this specific issue, and it seems like the most
> recent relevant discussion on the matter. I'd like to chime in with
> our personal experience.
>
> We have a tool[1] that allows unprivileged use of namespaces
> (when using a userns, which is the default).
>
> The primary use-case of said tool is lightweight containerization,
> but we're also using it for other mundane usages, like a better
> substitute for fakeroot to build and package privileged software
> (e.g. sudo or ping, which need to be installed with special
> capabilities) unprivileged, or to copy file trees that are owned by
> the user or sub-ids.
>
> For the first use-case, it's always safe to drop unmapped groups,
> because the target rootfs is always owned by the user or its sub-ids.
>
> For the other use-cases, this is more problematic, as you're all
> well aware. Our position right now is that the tool will always
> allow setgroups in a user namespace, and that it's not safe to use on
> systems that rely on negative access groups.
>
> I think that something that's not mentioned is that if a user setgroups
> to a fixed list of subgids, dropping all unmapped gids, they don't just
> gain the ability to access these negative-access files, they also lose
> legitimate access to files that their unmapped groups allow them to
> access. This is fine for our first use-case, but a bit surprising for
> the second one -- and since setgroups never lets us keep unmapped gids,
> we have no way to keep these desired groups.
>
> At first glance, a sysctl that explicitly controls that would not
> address the above problem, but keeping around the original group list
> of the owner of the user ns would have the desired semantics.
>
> Giuseppe's patch seems to address this use case, which would personally
> make me very happy.
>
> [1]: https://github.com/aristanetworks/bst

Thanks for the feedback. We are still facing the issue with rootless
Podman, and these patches (listed here so you won't need to dig into archives):

https://github.com/giuseppe/linux/commit/7e0701b389c497472d11fab8570c153a414050af
https://github.com/giuseppe/linux/commit/1c5fe726346b216293a527719e64f34e6297f0c2

would solve the issue for us as well, since they would let us use
setgroups within a user namespace in a safe way.
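
To make the problem from the quoted message concrete, here is a minimal
sketch of both effects of dropping unmapped gids: gaining access to a
negative-access file and losing access to a legitimately shared one. It
is a simplified model of the POSIX owner/group/other read check, not the
kernel implementation and not the code in the patches above; it ignores
capabilities and ACLs, and all uids, gids and file modes are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class File:
    uid: int   # owning uid
    gid: int   # owning gid
    mode: int  # permission bits, e.g. 0o604


def can_read(f, uid, gid, groups):
    """Simplified read check: the first matching class decides."""
    if uid == f.uid:
        return bool(f.mode & 0o400)
    if f.gid == gid or f.gid in groups:
        return bool(f.mode & 0o040)
    return bool(f.mode & 0o004)


# Hypothetical setup: gid 2000 is a "negative access" group,
# gid 3000 is a group that grants legitimate access.
blocked = File(uid=0, gid=2000, mode=0o604)  # group denied, others allowed
shared = File(uid=0, gid=3000, mode=0o640)   # group allowed, others denied

original_groups = {1000, 2000, 3000}
# setgroups() to a list that only contains gids mapped in the userns
# (assumed here: only gid 1000 is mapped).
dropped_groups = {1000}

before = (can_read(blocked, 1000, 1000, original_groups),
          can_read(shared, 1000, 1000, original_groups))
after = (can_read(blocked, 1000, 1000, dropped_groups),
         can_read(shared, 1000, 1000, dropped_groups))

print(before)  # (False, True)
print(after)   # (True, False)
```

Keeping the original group list of the userns owner, as suggested in the
quoted message, corresponds to the "before" result in this model.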

Any comments on the approach taken in these patches? Could we move
forward with it?

Regards,
Giuseppe