Re: [PATCH 3/7] seccomp_filter: Enable ftrace-based system call filtering
From: Frederic Weisbecker
Date: Thu Apr 28 2011 - 11:12:57 EST
On Wed, Apr 27, 2011 at 10:08:47PM -0500, Will Drewry wrote:
> This change adds a new seccomp mode based on the work by
> agl@xxxxxxxxxxxxx. This mode comes with a bitmask of NR_syscalls size and
> an optional linked list of seccomp_filter objects. When in mode 2, all
> system calls are first checked against the bitmask to determine if they
> are allowed or denied. If allowed, the list of filters is checked for
> the given syscall number. If all filter predicates for the system call
> match or the system call was allowed without restriction, the process
> continues. Otherwise, it is killed and a KERN_INFO notification is
> posted.
>
> The filter language itself is provided by the ftrace filter engine.
> Related patches tweak the perf filter trace and free to allow the calls
> to be shared. Filters inherit their understanding of types and arguments
> for each system call from the CONFIG_FTRACE_SYSCALLS subsystem which
> predefines this information in syscall_metadata associated enter_event
> (and exit_event) structures.
>
> The result is that a process may reduce its available interfaces to
> the kernel through prctl() without knowing the appropriate system call
> number a priori and with the flexibility of filtering based on
> register-stored arguments. (String checks suffer from TOCTOU issues and
> should be left to LSMs to provide policy for! Don't get greedy :)
>
> A sample filterset for a process that only needs to interact over stdin
> and stdout and exit cleanly is shown below:
>   sys_read: fd == 0
>   sys_write: fd == 1
>   sys_exit_group: 1
>
> The filters may be specified once prior to entering the reduced access
> state:
>   prctl(PR_SET_SECCOMP, 2, filters);
Instead of having such a multiline filter definition with syscall
names prepended, it would be nicer to make the parsing simpler.
You could have either:
    prctl(PR_SET_SECCOMP, mode);
    /* Works only if we are in mode 2 */
    prctl(PR_SET_SECCOMP_FILTER, syscall_nr, filter);
or:
    /*
     * If mode == 2, set the filter for syscall_nr.
     * Call this again for each syscall that needs a filter.
     * If a filter was previously set on the targeted syscall,
     * it will be overwritten.
     */
    prctl(PR_SET_SECCOMP, mode, syscall_nr, filter);
One can erase a previous filter by setting the new filter to "1".
Also, instead of having a bitmap of syscalls to accept, you could
simply set "0" as the filter for those you want to deactivate:
    prctl(PR_SET_SECCOMP, 2, 1, 0); <- deactivate the syscall_nr 1
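To make the suggested semantics concrete, here is a rough userspace
model of a per-syscall filter table. It is only a sketch, not kernel
code: model_set_filter(), model_check(), model_demo() and the MODEL_*
names are all made up for illustration, filters are kept as plain
strings, and an unset slot is reported as MODEL_UNSET because the
default policy is not settled above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
#define MODEL_NR_SYSCALLS 8	/* tiny table standing in for NR_syscalls */
#define MODEL_FILTER_MAX  64

enum model_verdict {
	MODEL_UNSET,	/* no filter was ever set; default policy left open */
	MODEL_DENY,	/* filter is "0" */
	MODEL_ALLOW,	/* filter is "1" */
	MODEL_EVALUATE	/* a real predicate the filter engine would have to run */
};

/* One filter string per syscall number; an empty string means unset. */
char model_filters[MODEL_NR_SYSCALLS][MODEL_FILTER_MAX];

/* Stands in for prctl(PR_SET_SECCOMP, 2, syscall_nr, filter). */
int model_set_filter(int nr, const char *filter)
{
	if (nr < 0 || nr >= MODEL_NR_SYSCALLS)
		return -1;
	if (filter == NULL || filter[0] == '\0' ||
	    strlen(filter) >= MODEL_FILTER_MAX)
		return -1;
	strcpy(model_filters[nr], filter);	/* overwrites any previous filter */
	return 0;
}

/* Classifies what would happen when syscall nr is entered. */
enum model_verdict model_check(int nr)
{
	if (nr < 0 || nr >= MODEL_NR_SYSCALLS)
		return MODEL_DENY;	/* assumption: out-of-range is refused */
	if (model_filters[nr][0] == '\0')
		return MODEL_UNSET;
	if (strcmp(model_filters[nr], "0") == 0)
		return MODEL_DENY;
	if (strcmp(model_filters[nr], "1") == 0)
		return MODEL_ALLOW;
	return MODEL_EVALUATE;
}

/* Replays the sequence discussed above against the model. */
void model_demo(void)
{
	assert(model_set_filter(0, "fd == 0") == 0);
	assert(model_check(0) == MODEL_EVALUATE);
	assert(model_set_filter(0, "1") == 0);	/* erase the restriction */
	assert(model_check(0) == MODEL_ALLOW);
	assert(model_set_filter(1, "0") == 0);	/* deactivate syscall_nr 1 */
	assert(model_check(1) == MODEL_DENY);
}
```

In this model a single table indexed by syscall number carries both
the allow/deny decision and the optional predicate, so no separate
bitmap is needed and each call only has to parse one filter string.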
Hm?