Re: [PATCH 3/3] capabilities: add cap userns sysctl mask

From: Jarkko Sakkinen
Date: Tue May 21 2024 - 10:46:12 EST


On Tue May 21, 2024 at 5:29 PM EEST, Tycho Andersen wrote:
> On Tue, May 21, 2024 at 01:12:57AM +0300, Jarkko Sakkinen wrote:
> > On Tue May 21, 2024 at 12:13 AM EEST, Tycho Andersen wrote:
> > > On Mon, May 20, 2024 at 12:25:27PM -0700, Jonathan Calmels wrote:
> > > > On Mon, May 20, 2024 at 07:30:14AM GMT, Tycho Andersen wrote:
> > > > > there is an ongoing effort (started at [0]) to constify the first arg
> > > > > here, since you're not supposed to write to it. Your usage looks
> > > > > correct to me, so I think all it needs is a literal "const" here.
> > > >
> > > > Will do, along with the suggestions from Jarkko
> > > >
> > > > > > + struct ctl_table t;
> > > > > > + unsigned long mask_array[2];
> > > > > > + kernel_cap_t new_mask, *mask;
> > > > > > + int err;
> > > > > > +
> > > > > > + if (write && (!capable(CAP_SETPCAP) ||
> > > > > > + !capable(CAP_SYS_ADMIN)))
> > > > > > + return -EPERM;
> > > > >
> > > > > ...why CAP_SYS_ADMIN? You mention it in the changelog, but don't
> > > > > explain why.
> > > >
> > > > No reason really, I was hoping we could decide what we want here.
> > > > UMH uses CAP_SYS_MODULE, Serge mentioned adding a new cap maybe.
> > >
> > > I don't have a strong preference between SETPCAP and a new capability,
> > > but I do think it should be just one. SYS_ADMIN is already god mode
> > > enough, IMO.
> >
> > Sometimes I think would it make more sense to invent something
> > completely new like capabilities but more modern and robust, instead of
> > increasing complexity of a broken mechanism (especially thanks to
> > CAP_MAC_ADMIN).
> >
> > I kind of liked the idea of privilege tokens both in Symbian and Maemo
> > (have been involved professionally in both). Emphasis on the idea not
> > necessarily on implementation.
> >
> > Not an LSM but like something that you could use in the place of POSIX
> > caps. Probably quite tedious effort tho because you would need to pull
> > the whole industry with the new thing...
>
> And then we have LSM hooks, (ns_)capable(), __secure_computing() plus
> a new set of hooks for this new thing sprinkled around. I guess
> kernel developers wouldn't be excited about it, let alone the rest of
> the industry :)
>
> Thinking out loud: I wonder if fixing the seccomp TOCTOU against
> pointers would help here. I guess you'd still have issues where your
> policy engine resolves a path arg to open() and that inode changes
> between the decision and the actual vfs access, you have just changed
> the TOCTOU.
>
> Or even scarier: what if you could change the return value at any
> kprobe? :)

I had one crazy idea related to seccomp filters once.

What if there was a way to compose tokens, each of which would be just
a seccomp filter like the one that you pass to PR_SET_SECCOMP, but
represented by a file descriptor?

Then you could send these with SCM_RIGHTS to other processes and they
could upgrade their existing filter with them. So it would be a kind of
extension mechanism for a seccomp filter.
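
To make the transport half of that concrete, here is a rough sketch of
passing an fd with SCM_RIGHTS over an AF_UNIX socket. Only the fd passing
is existing API. The "token" fd is purely hypothetical: no kernel
interface hands out a seccomp filter as an fd that a receiver could merge
into its own filter, so any fd (e.g. a pipe end) has to stand in for it
here. The helper names send_fd() and recv_fd() are made up for this
example.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Send one file descriptor over a connected AF_UNIX socket.
 * In the idea above, 'fd' would be the (hypothetical) filter token.
 * Returns 0 on success, -1 on failure.
 */
static int send_fd(int sock, int fd)
{
	char dummy = 'F';	/* at least one byte of real data must be sent */
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union {			/* union keeps the control buffer aligned */
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} u;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	assert(fd >= 0);

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/*
 * Receive one file descriptor sent by send_fd().
 * Returns the new fd (a duplicate in the receiver's fd table) or -1.
 */
static int recv_fd(int sock)
{
	char dummy;
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} u;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	if (recvmsg(sock, &msg, 0) != 1 || (msg.msg_flags & MSG_CTRUNC))
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

The missing piece would be whatever the receiver does with the fd after
recv_fd() returns, i.e. some new operation that stacks the filter behind
the token on top of the one already installed. That part is the
hand-waving.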

Not something I'm seriously suggesting, but I thought I'd flush this out
now that we are on these topics anyhow ;-)

> Tycho

PS. Sorry if my language was a bit harsh earlier, but I think I also had
a point, at least regarding the patch set presentation. I.e. you are
very precise in describing the mechanism, but the motivation and
putting the topic into some context are equally important :-)

BR, Jarkko