Re: [PATCH] Capabilities, this time in elf section

Albert D. Cahalan (acahalan@cs.uml.edu)
Fri, 16 Apr 1999 00:04:34 -0400 (EDT)


Richard Gooch writes:
> Hi, David.
>> Hi Richard,

>>> No! I already made this point last week. You only need suid-root to
>>> *grant* caps. Any binary can have its CAP ELF section with a mask of
>>> caps that are *removed*. There is no privilege needed for that.

This is a perfectly good idea, though not in the current design.

>> Well, for one, the standard for capabilities doesn't exactly have a 'deny
>> mask' as such; it would have to be done in the interaction of the current
>> process' inheritable set and the file's inheritable set. You _could_ just
>> put a flat 'deny' mask in there, but you get further from true
>> capabilities, and the file owner can modify them.

Please don't say "true capabilities", which means something very different.
With true capabilities, we just get rid of the whole damn filesystem.
After all, the filesystem namespace is the source of many problems.
Intead of a filesystem, we can just pass around object handles. This is
totally weird of course, insanely secure, and not even close to POSIX.
All the world breaks, except perhaps the "Hello, world." program.
Processes should be persistant too, even accross reboots. (otherwise,
you would lose all your data -- remember there is no filesystem)
BTW, there is no need for ACLs on such a system.

What we have is a capability list system using a bitmap for performance.
I'd rather see it called by the old "privileges" name to reduce confusion.
The problem with this and everything else we have is that we will always
want more bits, more groups, more ACL entries... and checking all that
security data requires something like O(pow(n,k)) time.

Anyway...

You certainly could have a deny mask. I like that very much. It should
prevent the exec, not just clear some bits. It isn't good to have
privileged executables crashing half-way through.

> I'm not necessarily advocating "capabilities light". I'm definately
> advocating a "capabilities practical" scheme, which obviously includes
> NFS and similar support.
>
> But I'm not convinced that "capabilities practical" means
> "capabilities light". If it does mean that, so be it. Mind you, I have
> yet to see this proven.

I agree with this very much. I generally agree with Richard Gooch
for some odd reason. He must be right. :-)

>>> Well, this isn't the case. If you read one of my earlier messages, I
>>> suggested putting the suid-root+CAP ELF header stuff into user
>>> space. So that means when you run the setcap programme, you either
>>> edit the binary (if it's statically linked) to insert some magic code,
>>> or ensure the dynamic linker has the magic code.

This is a very interesting idea. I hesitate to trust the startup
code that much, but this method might be best. It seems to screw up an
idea I had for getting rid of some of the UID 0 special cases though.

> In addition, my scheme gives you a pretty flexible system for
> increasing security on non-caps kernels. You can easily set things up
> so that when root runs another binary, the uid and euid are set to
> nobody, if that binary has no capability header, or doesn't have any
> inheritable caps set. The only thing you can't control is statically
> linked binaries without the magic code. But if you're setting up a
> secure system, you'd relink those anyway.

The static binaries problem is very annoying. Maybe a static binary
could be embedded inside another binary.

> This does two things:
>
> - it prevents the problem of luser running /bin/sh (which is suid-root
> so a cap-enabled kernel will allow inherits) and getting a root shell
>
> - it provides *some* of the security benefits of file-based
> inheritance, so that any dynamically linked binary (or a new
> statically linked binary) can elect to drop root privileges.
>
> For the non-caps kernel case, the magic code can be used to prevent
> some/most/all binaries running as root.
>
> And remember, just because you have all these suid-root binaries
> floating around, it doesn't mean root can get cracked, or that you
> have a root account.

We need to develop this in secret, then give the head OpenBSD developer
an account. He might just well die when he sees that damn near everything
appears to be setuid-root.

> I hope I've explained how my scheme works.

It is very interesting. It could be the best solution yet, though I
don't wish to claim that without a bit more consideration.

-- why this system may seem odd --

I have noticed that the UID 0 hack in the kernel is very gross.
The kernel pretends that the file had fI and fE set. Ugh.

Then consider the bits set in a normal user process. You can see
them in /proc/*/status. Odd, isn't it?

Now look at the pP etc. calculation. The _only_ capability sets
that provide influence are pI and the file-based sets.

This means that fI and fP are essentially the same thing! You can
set a bit in either one, and the effect will be the same.

Clearly this is not the intent. It seems that the pI bits were
intended to be _unset_ for normal processes. This would make fP
provide setuid-like effects and make ~fI be a mask used to make
sure admins use authorized tools for their admin work.

If I am right about this, then we should make pI unset for normal
processes and assume that unmarked executables have a full fI.
(only assume fI is full when running a somewhat normal system)
With such a change, plain setuid root executables would get pI
filled; this lets the full fI take effect.

If you do the above but don't assume fI is full, then admins can
only exercise their powers when running specially marked executables.
This is nicely secure, but would make a terrible default.

-- bit allocation --

Some of the undefined bits in pI should be _set_ and reserved
for a time when Linux makes "normal" user rights be capabilities.
Let's say that bits 32..63 are for network _client_ use, plain chmod,
the fork() and exec() system calls, etc.

Some of the undefined bits could be reserved for userspace.
For example, there could be an X admin capability. The kernel
would handle the inheritance of this bit, but would not use it.
The X server could check the bit on local connections.

-- new bitmap proposal --

I propose min and max data as follows:

pm Always set these.
pM If ~pM would be in pP' then after an exec, then return -EPERM.
fm If these would not be set after an exec, then return -EPERM.
fM Remove unset bits from pm. (?)

Calculation becomes something like this:

new_pP = old_fP | (old_fI & old_pI) | old_pm;
if(new_pP & ~old_pM) return -EPERM;
if((new_pP & old_pm) != old_pm) return -EPERM;

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/