Re: [PATCH v2 2/4] seccomp: Add sysctl to configure actions that should be logged

From: Kees Cook
Date: Tue Feb 07 2017 - 19:31:18 EST


On Thu, Feb 2, 2017 at 9:37 PM, Tyler Hicks <tyhicks@xxxxxxxxxxxxx> wrote:
> Administrators can write to this sysctl to set the maximum seccomp
> action that should be logged. Any actions with values greater than
> what's written to the sysctl will not be logged.
>
> For example, all SECCOMP_RET_KILL, SECCOMP_RET_TRAP, and
> SECCOMP_RET_ERRNO actions would be logged if "errno" were written to the
> sysctl. SECCOMP_RET_TRACE and SECCOMP_RET_ALLOW actions would not be
> logged since their values are higher than SECCOMP_RET_ERRNO.
>
> The path to the sysctl is:
>
> /proc/sys/kernel/seccomp/max_action_to_log

/me looks for new bikeshed paint.

How about .../seccomp/action_log ? (And a corresponding
s/max_action_to_log/action_log/, if that looks readable...) I think
four words is just too long. :)

> The actions_avail sysctl can be read to discover the valid action names
> that can be written to the max_action_to_log sysctl. The actions_avail
> sysctl is also useful in understanding the ordering of actions used when
> deciding the maximum action to log.
>
> The default setting for the sysctl is to only log SECCOMP_RET_KILL
> actions which matches the existing behavior.
>
> There's one important exception to this sysctl. If a task is
> specifically being audited, meaning that an audit context has been
> allocated for the task, seccomp will log all actions other than
> SECCOMP_RET_ALLOW despite the value of max_action_to_log. This exception
> preserves the existing auditing behavior of tasks with an allocated
> audit context.
>
> Signed-off-by: Tyler Hicks <tyhicks@xxxxxxxxxxxxx>
> ---
> include/linux/audit.h | 6 +--
> kernel/seccomp.c | 114 ++++++++++++++++++++++++++++++++++++++++++++++++--
> 2 files changed, 112 insertions(+), 8 deletions(-)
>
> diff --git a/include/linux/audit.h b/include/linux/audit.h
> index f51fca8d..e0d95fc 100644
> --- a/include/linux/audit.h
> +++ b/include/linux/audit.h
> @@ -315,11 +315,7 @@ void audit_core_dumps(long signr);
>
> static inline void audit_seccomp(unsigned long syscall, long signr, int code)
> {
> - if (!audit_enabled)
> - return;
> -
> - /* Force a record to be reported if a signal was delivered. */
> - if (signr || unlikely(!audit_dummy_context()))
> + if (audit_enabled && unlikely(!audit_dummy_context()))
> __audit_seccomp(syscall, signr, code);
> }
>
> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> index 919ad9f..548fb89 100644
> --- a/kernel/seccomp.c
> +++ b/kernel/seccomp.c
> @@ -509,6 +509,24 @@ static void seccomp_send_sigsys(int syscall, int reason)
> }
> #endif /* CONFIG_SECCOMP_FILTER */
>
> +static u32 seccomp_max_action_to_log = SECCOMP_RET_KILL;
> +
> +static void seccomp_log(unsigned long syscall, long signr, u32 action)

Please mark this inline...

> +{
> + /* Force an audit message to be emitted when the action is not greater
> + * than the configured maximum action.
> + */
> + if (action <= seccomp_max_action_to_log)
> + return __audit_seccomp(syscall, signr, action);
> +
> + /* If the action is not an ALLOW action, let the audit subsystem decide
> + * if it should be audited based on whether the current task itself is
> + * being audited.
> + */
> + if (action != SECCOMP_RET_ALLOW)
> + return audit_seccomp(syscall, signr, action);

Based on my thoughts below, this test can actually be removed (making
the audit_seccomp() call unconditional), since callers will always be
!= RET_ALLOW.

> +}
> +
> /*
> * Secure computing mode 1 allows only read/write/exit/sigreturn.
> * To be fully secure this must be combined with rlimit
> @@ -534,7 +552,7 @@ static void __secure_computing_strict(int this_syscall)
> #ifdef SECCOMP_DEBUG
> dump_stack();
> #endif
> - audit_seccomp(this_syscall, SIGKILL, SECCOMP_RET_KILL);
> + seccomp_log(this_syscall, SIGKILL, SECCOMP_RET_KILL);
> do_exit(SIGKILL);
> }
>
> @@ -633,18 +651,19 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
> return 0;
>
> case SECCOMP_RET_ALLOW:
> + seccomp_log(this_syscall, 0, action);
> return 0;

I am extremely sensitive about anything appearing in the RET_ALLOW
case, since it's the hot path for seccomp. This adds a full function
call (which also contains a redundant test: the action IS RET_ALLOW,
so we'll never call audit_seccomp() in seccomp_log()).

While the inline request above removes the function call, it's not
clear to me if gcc is going to do the right thing here, and I'd like
to assist the branch predictor (likely separate from the other case),
so I think I'd like an open-coded test here instead:

case SECCOMP_RET_ALLOW:
/* Open-coded seccomp_log(), optimized for RET_ALLOW. */
if (unlikely(seccomp_max_action_to_log == 0))
__audit_seccomp(syscall, signr, action);
return 0;

> case SECCOMP_RET_KILL:
> default:
> - audit_seccomp(this_syscall, SIGSYS, action);
> + seccomp_log(this_syscall, SIGSYS, action);
> do_exit(SIGSYS);
> }
>
> unreachable();
>
> skip:
> - audit_seccomp(this_syscall, 0, action);
> + seccomp_log(this_syscall, 0, action);
> return -1;
> }
> #else
> @@ -910,18 +929,102 @@ long seccomp_get_filter(struct task_struct *task, unsigned long filter_off,
>
> #ifdef CONFIG_SYSCTL
>
> +/* Human readable action names for friendly sysctl interaction */

This line should be in patch 1.

> #define SECCOMP_RET_KILL_NAME "kill"
> #define SECCOMP_RET_TRAP_NAME "trap"
> #define SECCOMP_RET_ERRNO_NAME "errno"
> #define SECCOMP_RET_TRACE_NAME "trace"
> #define SECCOMP_RET_ALLOW_NAME "allow"
>
> +/* Largest strlen() of all action names */
> +#define SECCOMP_RET_MAX_NAME_LEN 5

This feels fragile... though I don't have a good suggestion yet. :P

> +
> static char seccomp_actions_avail[] = SECCOMP_RET_KILL_NAME " "
> SECCOMP_RET_TRAP_NAME " "
> SECCOMP_RET_ERRNO_NAME " "
> SECCOMP_RET_TRACE_NAME " "
> SECCOMP_RET_ALLOW_NAME;
>
> +struct seccomp_action_name {
> + u32 action;
> + const char *name;
> +};
> +
> +static struct seccomp_action_name seccomp_action_names[] = {
> + { SECCOMP_RET_KILL, SECCOMP_RET_KILL_NAME },
> + { SECCOMP_RET_TRAP, SECCOMP_RET_TRAP_NAME },
> + { SECCOMP_RET_ERRNO, SECCOMP_RET_ERRNO_NAME },
> + { SECCOMP_RET_TRACE, SECCOMP_RET_TRACE_NAME },
> + { SECCOMP_RET_ALLOW, SECCOMP_RET_ALLOW_NAME },
> + { }
> +};
> +
> +static bool seccomp_name_from_action(const char **namep, u32 action)
> +{
> + struct seccomp_action_name *cur;
> +
> + for (cur = seccomp_action_names; cur->name; cur++) {
> + if (cur->action == action) {
> + *namep = cur->name;
> + return true;
> + }
> + }
> +
> + return false;
> +}
> +
> +static bool seccomp_action_from_name(u32 *action, const char *name)
> +{
> + struct seccomp_action_name *cur;
> +
> + for (cur = seccomp_action_names; cur->name; cur++) {
> + if (!strcmp(cur->name, name)) {
> + *action = cur->action;
> + return true;
> + }
> + }
> +
> + return false;
> +}
> +
> +static int seccomp_max_action_to_log_handler(struct ctl_table *table, int write,
> + void __user *buffer, size_t *lenp,
> + loff_t *ppos)
> +{
> + char name[SECCOMP_RET_MAX_NAME_LEN + 1] = {0};

{ } preferred over { 0 }

> + int ret;
> +
> + if (write && !capable(CAP_SYS_ADMIN))
> + return -EPERM;
> +
> + if (!write) {
> + const char *namep;
> +
> + if (!seccomp_name_from_action(&namep,
> + seccomp_max_action_to_log))
> + return -EINVAL;
> +
> + strncpy(name, namep, sizeof(name) - 1);
> + }
> +
> + table->data = name;
> + table->maxlen = sizeof(name);

In the hopes of some day making the sysctl table entirely read-only,
can you add some fancy crap here for me? :) See
security/yama/yama_lsm.c's yama_dointvec_minmax(), which uses a copy
of the sysctl table on the stack.

> + ret = proc_dostring(table, write, buffer, lenp, ppos);
> + if (ret)
> + return ret;
> +
> + if (write) {
> + u32 action;
> +
> + if (!seccomp_action_from_name(&action, table->data))
> + return -EINVAL;
> +
> + seccomp_max_action_to_log = action;
> + }
> +
> + return 0;
> +}
> +
> static struct ctl_path seccomp_sysctl_path[] = {
> { .procname = "kernel", },
> { .procname = "seccomp", },
> @@ -936,6 +1039,11 @@ static struct ctl_table seccomp_sysctl_table[] = {
> .mode = 0444,
> .proc_handler = proc_dostring,
> },
> + {
> + .procname = "max_action_to_log",
> + .mode = 0644,
> + .proc_handler = seccomp_max_action_to_log_handler,
> + },
> { }
> };
>
> --
> 2.7.4
>

(Though I still wonder if a numeric would be easier...)

-Kees

--
Kees Cook
Pixel Security