Re: [RFC PATCHv2 00/11] Adding FreeBSD's Capsicum security framework

From: Paolo Bonzini
Date: Mon Jul 28 2014 - 08:31:31 EST


On 26/07/2014 23:04, Eric W. Biederman wrote:
>> The most significant aspect of Capsicum is associating *rights* with
>> (some) file descriptors, so that the kernel only allows operations on an
>> FD if the rights permit it. This allows userspace applications to
>> sandbox themselves by tightly constraining what's allowed with both
>> inputs and outputs; for example, tcpdump might restrict itself so it can
>> only read from the network FD, and only write to stdout.
>>
>> The kernel thus needs to police the rights checks for these file
>> descriptors (referred to as 'Capsicum capabilities', completely
>> different than POSIX.1e capabilities), and the best place to do this is
>> at the points where a file descriptor from userspace is converted to a
>> struct file * within the kernel.
>>
>> [Policing the rights checks anywhere else, for example at the system
>> call boundary, isn't a good idea because it opens up the possibility
>> of time-of-check/time-of-use (TOCTOU) attacks [2] where FDs are
>> changed (as openat/close/dup2 are allowed in capability mode) between
>> the 'check' at syscall entry and the 'use' at fget() invocation.]
>>
>> However, this does lead to quite an invasive change to the kernel --
>> every invocation of fget() or similar functions (fdget(),
>> sockfd_lookup(), user_path_at(),...) needs to be annotated with the
>> rights associated with the specific operations that will be performed on
>> the struct file. There are ~100 such invocations that need
>> annotation.
>
> And it is silly. Roughly you just need a locking version of
> fcntl(F_SETFL).
>
> That is make the restriction in the struct file not in the fd to file
> lookup.

No, they have to be in the file descriptor. The same file can be
dup'ed into several descriptors, and each one passed with different
capabilities to a different process.

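Say you pass an eventfd to a process with SCM_RIGHTS, and you want to
only allow that process to write to it. If the restriction lived in the
struct file, it would also apply to your own descriptor, and you could
no longer read from the eventfd yourself.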
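
A minimal sketch of the distinction, not taken from the patch set: the
kernel already keeps some state per struct file and some per descriptor,
and dup() shows the difference. The helper names below are hypothetical,
and the code uses only plain POSIX calls rather than any Capsicum
interface. A flag set with F_SETFL shows up on every dup'ed descriptor,
which is the behaviour a "locking F_SETFL" would inherit; a flag set
with F_SETFD stays on the one descriptor, which is what per-descriptor
rights need.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helpers for illustration; nothing here is Capsicum API.
 * Each returns 1 if the dup'ed descriptor sees the flag that was set
 * through the original descriptor, 0 if it does not, -1 on error.
 */

/* O_NONBLOCK is a file status flag: it lives in the shared struct file. */
int status_flag_seen_by_dup(void)
{
	int p[2], copy, flags, seen = -1;

	if (pipe(p) < 0)
		return -1;
	copy = dup(p[0]);
	if (copy >= 0) {
		flags = fcntl(p[0], F_GETFL);
		if (flags >= 0 &&
		    fcntl(p[0], F_SETFL, flags | O_NONBLOCK) == 0) {
			/* Query the copy, not the descriptor we changed. */
			flags = fcntl(copy, F_GETFL);
			if (flags >= 0)
				seen = (flags & O_NONBLOCK) != 0;
		}
		close(copy);
	}
	close(p[0]);
	close(p[1]);
	return seen;
}

/* FD_CLOEXEC is a descriptor flag: it lives in the fd table entry. */
int fd_flag_seen_by_dup(void)
{
	int p[2], copy, flags, seen = -1;

	if (pipe(p) < 0)
		return -1;
	copy = dup(p[0]);
	if (copy >= 0) {
		flags = fcntl(p[0], F_GETFD);
		if (flags >= 0 &&
		    fcntl(p[0], F_SETFD, flags | FD_CLOEXEC) == 0) {
			/* Query the copy, not the descriptor we changed. */
			flags = fcntl(copy, F_GETFD);
			if (flags >= 0)
				seen = (flags & FD_CLOEXEC) != 0;
		}
		close(copy);
	}
	close(p[0]);
	close(p[1]);
	return seen;
}
```

Here status_flag_seen_by_dup() returns 1 and fd_flag_seen_by_dup()
returns 0: the first kind of state is shared by every holder of the
file, the second belongs to a single descriptor.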

>> 4) New System Calls
>> -------------------
>>
>> To allow userspace applications to access the Capsicum capability
>> functionality, I'm proposing two new system calls: cap_rights_limit(2)
>> and cap_rights_get(2). I guess these could potentially be implemented
>> elsewhere (e.g. as fcntl(2) operations?) but the changes seem
>> significant enough that new syscalls are warranted.
>>
>> [FreeBSD 10.x actually includes six new syscalls for manipulating the
>> rights associated with a Capsicum capability -- the capability rights
>> can police that only specific fcntl(2) or ioctl(2) commands are
>> allowed, and FreeBSD sets these with distinct syscalls.]
>
> ioctls? In a sandbox? Ick.

KVM? X11? Both of them use loads of ioctls. I'm less sure of the
benefit of picking which fcntls to allow.

Paolo