Re: [PATCH v2 2/4] capabilities: Add securebit to restrict userns caps

From: Jonathan Calmels
Date: Mon Jun 10 2024 - 05:41:19 EST


On Sun, Jun 09, 2024 at 09:33:01PM GMT, Serge E. Hallyn wrote:
> On Sun, Jun 09, 2024 at 03:43:35AM -0700, Jonathan Calmels wrote:
> > This patch adds a new capability security bit designed to constrain a
> > task’s userns capability set to its bounding set. The reason for this is
> > twofold:
> >
> > - This serves as a quick and easy way to lock down a set of capabilities
> > for a task, thus ensuring that any namespace it creates will never be
> > more privileged than itself is.
> > - This helps userspace transition to more secure defaults by not requiring
> > specific logic for the userns capability set, or libcap support.
> >
> > Example:
> >
> > # capsh --secbits=$((1 << 8)) --drop=cap_sys_rawio -- \
> > -c 'unshare -r grep Cap /proc/self/status'
> > CapInh: 0000000000000000
> > CapPrm: 000001fffffdffff
> > CapEff: 000001fffffdffff
> > CapBnd: 000001fffffdffff
> > CapAmb: 0000000000000000
> > CapUNs: 000001fffffdffff
>
> But you are not (that I can see, in this or the previous patch)
> keeping SECURE_USERNS_STRICT_CAPS in securebits on the next
> level unshare. Though I think it's ok, because by then both
> cap_userns and cap_bset are reduced and cap_userns can't be
> expanded. (Sorry, just thinking aloud here)

Right, this is safe to reset, but maybe we should keep it if the secbit
is locked? This is kind of a special case compared to the other bits.
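
As a rough userspace model of the behaviour discussed above (not the
patch code), the clamp and the "keep it only if locked" idea could be
sketched as below. The bit value 1 << 8 comes from the capsh example;
using bit 9 as its lock bit is an assumption that follows the existing
securebits convention, and the function names are hypothetical.

```c
#include <stdint.h>

/*
 * Illustration only. Bit 8 matches the --secbits=$((1 << 8)) example;
 * bit 9 as the lock bit is an ASSUMPTION based on the usual securebits
 * layout (each bit is followed by its lock bit).
 */
#define SECBIT_USERNS_STRICT_CAPS        (1u << 8)
#define SECBIT_USERNS_STRICT_CAPS_LOCKED (1u << 9)

#define CAP_SYS_RAWIO_NR 17 /* CAP_SYS_RAWIO */
#define CAP_LAST_CAP_NR  40 /* CAP_CHECKPOINT_RESTORE */
#define CAP_FULL_SET     ((UINT64_C(1) << (CAP_LAST_CAP_NR + 1)) - 1)

/*
 * Capabilities a task would get in a user namespace it creates:
 * clamped to the bounding set only when the strict securebit is set.
 */
uint64_t new_userns_caps(uint64_t cap_userns, uint64_t cap_bset,
                         unsigned int secbits)
{
    if (secbits & SECBIT_USERNS_STRICT_CAPS)
        return cap_userns & cap_bset;
    return cap_userns;
}

/*
 * Hypothetical policy for the next-level unshare: securebits are
 * reset, except that the strict bit and its lock survive when locked.
 */
unsigned int secbits_after_unshare(unsigned int secbits)
{
    unsigned int keep = SECBIT_USERNS_STRICT_CAPS |
                        SECBIT_USERNS_STRICT_CAPS_LOCKED;

    if (secbits & SECBIT_USERNS_STRICT_CAPS_LOCKED)
        return secbits & keep;
    return 0;
}
```

With a full cap_userns and a bounding set missing CAP_SYS_RAWIO,
new_userns_caps() yields the 000001fffffdffff mask shown in the capsh
example. Without the lock bit, secbits_after_unshare() returns 0, which
is the "safe to reset" case.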

> > + /* Limit userns capabilities to our parent's bounding set. */
>
> In the case of userns_install(), it will be the target user namespace
> creator's bounding set, right? Not "our parent's"?

Good point, I should reword this comment.