Re: [patch 2.1.97] more capabilities support

Alexander Kjeldaas (astor@guardian.no)
Tue, 28 Apr 1998 17:49:31 +0200


On Mon, Apr 27, 1998 at 02:26:22PM +0200, Andrej Presern wrote:
> Theodore Y. Ts'o wrote:
> >
> > Huh? That's hardly a pure-capability system. In a pure capability
> > system, you have something that is effectively like a "key", which can
> > get passed around between processes, and a privileged program is merely
> > a program that has several keys that confer more abilities than the
> > standard default capability.
>
> Don't you think that the two syscall capabilities are essentially two
> 'keys' that 'open different syscalls'? I designed the syscall capability
> the way I did exactly to demonstrate that you don't necessarily have to
> pass the capability like a fd when you invoke it, keeping existing
> syscall interfaces unchanged.
>
> I stated in the mail that for starters I don't want a process to be able
> to pass the syscall capability around to other processes (which it could
> nonetheless if we don't tie the memory containing the copy of the
> sys_call_table to the process that requested it).
>
> You don't need a special authority granting daemon to have pure
> capabilities because every object can have its own set that it can give
> away. In the case of a syscall capability that I described, the kernel
> gives a single syscall capability to all processes by default, and if
> the process wants it can create its own subset of the system calls that
> it received. As I already stated, if we don't tie this syscall
> capability to the process (that is if we don't free the memory that it
> occupies when the process exits), a process could as well communicate it
> to other processes (the reference is just 4 bytes) if, of course, it
> includes the authority to call system calls that can be used to
> communicate with other processes.
>

What you are suggesting is simply a less useful version of POSIX
capabilities. Your syscall capabilities are

1) less expressive

I mentioned a few things you can't express with your model in
another post. Conversely, you can express _everything_ your model
can do using POSIX capabilities. I described this in an earlier
mail, when someone wanted a CAP_NETWORK capability to restrict a
process's access to the network. It requires the following changes
to the kernel, which I have held back for now to avoid complexity.
1.1) You have to read some capabilities as "set" in the allowed
set when reading them from the file system.
1.2) You have to add credentials checks to the kernel that
currently don't exist.
The reason you don't have CAP_EXEC and CAP_NETWORK isn't that they
are inherently difficult to implement, but that we want to take one
step at a time. Partitioning the root privilege is a well-defined step.

2) not pure

I can emulate your sys_call_table with a 183-bit integer, each bit
saying whether the corresponding system call may be invoked. How
can you argue that POSIX capabilities are inherently broken when
the similarities between POSIX and your suggestion are so obvious?

3) they require 60 times the memory, and
4) are harder to administer, and
5) are probably slower

I think the cache effects alone of having different
sys_call_tables make them slower than looking up a bit in a
table. However, if that turns out not to be true, nothing stops
us from implementing, for instance, CAP_SYS_RAWIO by removing
iopl and ioperm from the system call table of the process.
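
To make the bit-lookup comparison in points 2 and 5 concrete, here is
a minimal sketch of a per-process system call mask. It is only an
illustration, not kernel code: struct syscall_mask, the mask_*()
helpers and NR_SYSCALLS are names invented for this example, and the
count of 183 is simply the figure used above.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <string.h>

/* Assumption: 183 system calls, the figure used in this mail. */
#define NR_SYSCALLS 183
#define WORD_BITS   (sizeof(unsigned long) * CHAR_BIT)
#define MASK_WORDS  ((NR_SYSCALLS + WORD_BITS - 1) / WORD_BITS)

/* Hypothetical type: one bit per system call number. */
struct syscall_mask {
    unsigned long bits[MASK_WORDS];
};

/* Start from "everything allowed", like the default syscall capability.
 * Padding bits above NR_SYSCALLS are set too, but are never consulted. */
static void mask_allow_all(struct syscall_mask *m)
{
    memset(m->bits, 0xff, sizeof(m->bits));
}

/* Remove one system call from the allowed subset. */
static void mask_deny(struct syscall_mask *m, unsigned int nr)
{
    assert(nr < NR_SYSCALLS);
    m->bits[nr / WORD_BITS] &= ~(1UL << (nr % WORD_BITS));
}

/* The check itself: one shift and one AND on a word that is
 * likely to be in cache already. */
static bool mask_allows(const struct syscall_mask *m, unsigned int nr)
{
    assert(nr < NR_SYSCALLS);
    return (m->bits[nr / WORD_BITS] >> (nr % WORD_BITS)) & 1UL;
}
```

A process that wants to give up execve would call mask_deny() with the
execve number (11 on i386) and the dispatcher would consult
mask_allows() before jumping through the one shared table.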

[...]
> ....
> /* program init */
> install_sys_call_table( without_execve );
>
> /* this is the dangerous part of the code where the attacker can get in
> */
> parse_input_from_user();
> ....
> /* this is the part of the program that needs authority to call execve()
> */
> install_sys_call_table( with_execve );
> execve( ... );
> install_sys_call_table( without_execve );
> ....

The above is _exactly_ the same as the following which uses
POSIX capabilities:

#include <sys/capability.h>
/* program init; CAP_EXEC is hypothetical, it does not exist (yet) */
cap_t me, me_execless;
cap_value_t caps[] = {CAP_EXEC};
me = cap_get_proc();
me_execless = cap_dup(me);
/* clear CAP_EXEC in the effective set only */
cap_set_flag(me_execless, CAP_EFFECTIVE, 1, caps, CAP_CLEAR);
cap_set_proc(me_execless);

/* this is the dangerous part of the code where the attacker can get in */
parse_input_from_user();
....
/* this is the part of the program that needs authority to call execve() */
cap_set_proc(me);
execve( ... );
cap_set_proc(me_execless);
....
cap_free(me);
cap_free(me_execless);
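
For readers who don't have the POSIX draft at hand, the following toy
model sketches what the cap_set_proc() calls above do to the process.
It is plain user-space C, not the libcap API: struct toy_caps, the
toy_*() helpers and TOY_CAP_EXEC are all invented for illustration.
The one rule it assumes is that a capability can be raised in the
effective set only while it is still in the permitted set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability number standing in for CAP_EXEC. */
#define TOY_CAP_EXEC 0u

/* Toy model of two of the per-process capability sets. */
struct toy_caps {
    uint32_t permitted;  /* what the process may ever use   */
    uint32_t effective;  /* what the kernel checks right now */
};

/* Temporarily give up a capability; it stays permitted. */
static void toy_drop_effective(struct toy_caps *c, unsigned int cap)
{
    assert(cap < 32);
    c->effective &= ~(UINT32_C(1) << cap);
}

/* Re-enable a capability; fails if it is no longer permitted. */
static bool toy_raise_effective(struct toy_caps *c, unsigned int cap)
{
    assert(cap < 32);
    uint32_t bit = UINT32_C(1) << cap;
    if (!(c->permitted & bit))
        return false;
    c->effective |= bit;
    return true;
}

/* Give up a capability for good: clear it in both sets. */
static void toy_drop_permitted(struct toy_caps *c, unsigned int cap)
{
    assert(cap < 32);
    uint32_t bit = UINT32_C(1) << cap;
    c->permitted &= ~bit;
    c->effective &= ~bit;
}

/* The check a privileged operation would perform. */
static bool toy_capable(const struct toy_caps *c, unsigned int cap)
{
    assert(cap < 32);
    return (c->effective >> cap) & 1u;
}
```

In terms of the example above, cap_set_proc(me_execless) corresponds to
toy_drop_effective() and cap_set_proc(me) to toy_raise_effective().
Only something like toy_drop_permitted() makes the loss irreversible.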

[...]
> The other way to abuse the program is that the attacker installs the
> syscall capability that contains execve() by itself before calling
> execve(). But this means that the attacker must have such a capability.
> And because it can't just produce one by itself, and because it can't
> get one from the system (it can only get a copy of the currently
> installed one) the only way to get it is by stealing it from the
> attacked process (and to do that it must know exactly where the attacked
> process holds it), which complicates things even more.
>

If you want it to be difficult to steal the capability from the
running process, you will have to _design_ it to be difficult. Until I
see some evidence suggesting it is difficult, I'll assume it is as
easy as stealing POSIX capabilities.

astor

-- 
 Alexander Kjeldaas, Guardian Networks AS, Trondheim, Norway
 http://www.guardian.no/
