Re: [PATCH v2 2/4] seccomp: Add sysctl to configure actions that should be logged

From: Tyler Hicks
Date: Fri Feb 10 2017 - 18:56:33 EST


On 02/07/2017 06:24 PM, Kees Cook wrote:
> On Thu, Feb 2, 2017 at 9:37 PM, Tyler Hicks <tyhicks@xxxxxxxxxxxxx> wrote:
>> Administrators can write to this sysctl to set the maximum seccomp
>> action that should be logged. Any actions with values greater than
>> what's written to the sysctl will not be logged.
>>
>> For example, all SECCOMP_RET_KILL, SECCOMP_RET_TRAP, and
>> SECCOMP_RET_ERRNO actions would be logged if "errno" were written to the
>> sysctl. SECCOMP_RET_TRACE and SECCOMP_RET_ALLOW actions would not be
>> logged since their values are higher than SECCOMP_RET_ERRNO.
>>
>> The path to the sysctl is:
>>
>> /proc/sys/kernel/seccomp/max_action_to_log
>
> /me looks for new bikeshed paint.
>
> How about .../seccomp/action_log ? (And a corresponding
> s/max_action_to_log/action_log/, if that looks readable...) I think
> four words is just too long. :)

Kees and I discussed this a bit over IRC today. We settled on
log_max_action for v3 of the patch set.

>
>> The actions_avail sysctl can be read to discover the valid action names
>> that can be written to the max_action_to_log sysctl. The actions_avail
>> sysctl is also useful in understanding the ordering of actions used when
>> deciding the maximum action to log.
>>
>> The default setting for the sysctl is to only log SECCOMP_RET_KILL
>> actions which matches the existing behavior.
>>
>> There's one important exception to this sysctl. If a task is
>> specifically being audited, meaning that an audit context has been
>> allocated for the task, seccomp will log all actions other than
>> SECCOMP_RET_ALLOW despite the value of max_action_to_log. This exception
>> preserves the existing auditing behavior of tasks with an allocated
>> audit context.
>>
>> Signed-off-by: Tyler Hicks <tyhicks@xxxxxxxxxxxxx>
>> ---
>> include/linux/audit.h | 6 +--
>> kernel/seccomp.c | 114 ++++++++++++++++++++++++++++++++++++++++++++++++--
>> 2 files changed, 112 insertions(+), 8 deletions(-)
>>
>> diff --git a/include/linux/audit.h b/include/linux/audit.h
>> index f51fca8d..e0d95fc 100644
>> --- a/include/linux/audit.h
>> +++ b/include/linux/audit.h
>> @@ -315,11 +315,7 @@ void audit_core_dumps(long signr);
>>
>> static inline void audit_seccomp(unsigned long syscall, long signr, int code)
>> {
>> - if (!audit_enabled)
>> - return;
>> -
>> - /* Force a record to be reported if a signal was delivered. */
>> - if (signr || unlikely(!audit_dummy_context()))
>> + if (audit_enabled && unlikely(!audit_dummy_context()))
>> __audit_seccomp(syscall, signr, code);
>> }
>>
>> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
>> index 919ad9f..548fb89 100644
>> --- a/kernel/seccomp.c
>> +++ b/kernel/seccomp.c
>> @@ -509,6 +509,24 @@ static void seccomp_send_sigsys(int syscall, int reason)
>> }
>> #endif /* CONFIG_SECCOMP_FILTER */
>>
>> +static u32 seccomp_max_action_to_log = SECCOMP_RET_KILL;
>> +
>> +static void seccomp_log(unsigned long syscall, long signr, u32 action)
>
> Please mark this inline...

Will do.

>
>> +{
>> + /* Force an audit message to be emitted when the action is not greater
>> + * than the configured maximum action.
>> + */
>> + if (action <= seccomp_max_action_to_log)
>> + return __audit_seccomp(syscall, signr, action);
>> +
>> + /* If the action is not an ALLOW action, let the audit subsystem decide
>> + * if it should be audited based on whether the current task itself is
>> + * being audited.
>> + */
>> + if (action != SECCOMP_RET_ALLOW)
>> + return audit_seccomp(syscall, signr, action);
>
> Based on my thoughts below, this test can actually be removed (making
> the audit_seccomp() call unconditional), since callers will always be
> != RET_ALLOW.

Agreed.

>
>> +}
>> +
>> /*
>> * Secure computing mode 1 allows only read/write/exit/sigreturn.
>> * To be fully secure this must be combined with rlimit
>> @@ -534,7 +552,7 @@ static void __secure_computing_strict(int this_syscall)
>> #ifdef SECCOMP_DEBUG
>> dump_stack();
>> #endif
>> - audit_seccomp(this_syscall, SIGKILL, SECCOMP_RET_KILL);
>> + seccomp_log(this_syscall, SIGKILL, SECCOMP_RET_KILL);
>> do_exit(SIGKILL);
>> }
>>
>> @@ -633,18 +651,19 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
>> return 0;
>>
>> case SECCOMP_RET_ALLOW:
>> + seccomp_log(this_syscall, 0, action);
>> return 0;
>
> I am extremely sensitive about anything appearing in the RET_ALLOW
> case, since it's the hot path for seccomp. This adds a full function
> call (which also contains a redundant test: the action IS RET_ALLOW,
> so we'll never call audit_seccomp() in seccomp_log()).
>
> While the inline request above removes the function call, it's not
> clear to me if gcc is going to do the right thing here, and I'd like
> to assist the branch predictor (likely separate from the other case),
> so I think I'd like an open-coded test here instead:
>
> case SECCOMP_RET_ALLOW:
> /* Open-coded seccomp_log(), optimized for RET_ALLOW. */
> if (unlikely(seccomp_max_action_to_log == 0))
> __audit_seccomp(syscall, signr, action);
> return 0;

That makes sense.

>
>> case SECCOMP_RET_KILL:
>> default:
>> - audit_seccomp(this_syscall, SIGSYS, action);
>> + seccomp_log(this_syscall, SIGSYS, action);
>> do_exit(SIGSYS);
>> }
>>
>> unreachable();
>>
>> skip:
>> - audit_seccomp(this_syscall, 0, action);
>> + seccomp_log(this_syscall, 0, action);
>> return -1;
>> }
>> #else
>> @@ -910,18 +929,102 @@ long seccomp_get_filter(struct task_struct *task, unsigned long filter_off,
>>
>> #ifdef CONFIG_SYSCTL
>>
>> +/* Human readable action names for friendly sysctl interaction */
>
> This line should be in patch 1.
>
>> #define SECCOMP_RET_KILL_NAME "kill"
>> #define SECCOMP_RET_TRAP_NAME "trap"
>> #define SECCOMP_RET_ERRNO_NAME "errno"
>> #define SECCOMP_RET_TRACE_NAME "trace"
>> #define SECCOMP_RET_ALLOW_NAME "allow"
>>
>> +/* Largest strlen() of all action names */
>> +#define SECCOMP_RET_MAX_NAME_LEN 5
>
> This feels fragile... though I don't have a good suggestion yet. :P

I agree and I also don't have a good solution. I didn't like having to
hard code it.

>
>> +
>> static char seccomp_actions_avail[] = SECCOMP_RET_KILL_NAME " "
>> SECCOMP_RET_TRAP_NAME " "
>> SECCOMP_RET_ERRNO_NAME " "
>> SECCOMP_RET_TRACE_NAME " "
>> SECCOMP_RET_ALLOW_NAME;
>>
>> +struct seccomp_action_name {
>> + u32 action;
>> + const char *name;
>> +};
>> +
>> +static struct seccomp_action_name seccomp_action_names[] = {
>> + { SECCOMP_RET_KILL, SECCOMP_RET_KILL_NAME },
>> + { SECCOMP_RET_TRAP, SECCOMP_RET_TRAP_NAME },
>> + { SECCOMP_RET_ERRNO, SECCOMP_RET_ERRNO_NAME },
>> + { SECCOMP_RET_TRACE, SECCOMP_RET_TRACE_NAME },
>> + { SECCOMP_RET_ALLOW, SECCOMP_RET_ALLOW_NAME },
>> + { }
>> +};
>> +
>> +static bool seccomp_name_from_action(const char **namep, u32 action)
>> +{
>> + struct seccomp_action_name *cur;
>> +
>> + for (cur = seccomp_action_names; cur->name; cur++) {
>> + if (cur->action == action) {
>> + *namep = cur->name;
>> + return true;
>> + }
>> + }
>> +
>> + return false;
>> +}
>> +
>> +static bool seccomp_action_from_name(u32 *action, const char *name)
>> +{
>> + struct seccomp_action_name *cur;
>> +
>> + for (cur = seccomp_action_names; cur->name; cur++) {
>> + if (!strcmp(cur->name, name)) {
>> + *action = cur->action;
>> + return true;
>> + }
>> + }
>> +
>> + return false;
>> +}
>> +
>> +static int seccomp_max_action_to_log_handler(struct ctl_table *table, int write,
>> + void __user *buffer, size_t *lenp,
>> + loff_t *ppos)
>> +{
>> + char name[SECCOMP_RET_MAX_NAME_LEN + 1] = {0};
>
> { } preferred over { 0 }

No problem.

>
>> + int ret;
>> +
>> + if (write && !capable(CAP_SYS_ADMIN))
>> + return -EPERM;
>> +
>> + if (!write) {
>> + const char *namep;
>> +
>> + if (!seccomp_name_from_action(&namep,
>> + seccomp_max_action_to_log))
>> + return -EINVAL;
>> +
>> + strncpy(name, namep, sizeof(name) - 1);
>> + }
>> +
>> + table->data = name;
>> + table->maxlen = sizeof(name);
>
> In the hopes of some day making the sysctl table entirely read-only,
> can you add some fancy crap here for me? :) See
> security/yama/yama_lsm.c's yama_dointvec_minmax(), which uses a copy
> of the sysctl table on the stack.

Will do. I'll deviate slightly from yama_dointvec_minmax(). To make it
clear that the ctl_table param shouldn't be modified, I'm going to name
it ro_table and then the stack variable will be named table.

Tyler

>
>> + ret = proc_dostring(table, write, buffer, lenp, ppos);
>> + if (ret)
>> + return ret;
>> +
>> + if (write) {
>> + u32 action;
>> +
>> + if (!seccomp_action_from_name(&action, table->data))
>> + return -EINVAL;
>> +
>> + seccomp_max_action_to_log = action;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> static struct ctl_path seccomp_sysctl_path[] = {
>> { .procname = "kernel", },
>> { .procname = "seccomp", },
>> @@ -936,6 +1039,11 @@ static struct ctl_table seccomp_sysctl_table[] = {
>> .mode = 0444,
>> .proc_handler = proc_dostring,
>> },
>> + {
>> + .procname = "max_action_to_log",
>> + .mode = 0644,
>> + .proc_handler = seccomp_max_action_to_log_handler,
>> + },
>> { }
>> };
>>
>> --
>> 2.7.4
>>
>
> (Though I still wonder if a numeric would be easier...)
>
> -Kees
>


Attachment: signature.asc
Description: OpenPGP digital signature