Re: [PATCH v5 06/10] seccomp,landlock: Handle Landlock events per process hierarchy

From: Andy Lutomirski
Date: Tue Feb 28 2017 - 15:12:58 EST


On Tue, Feb 21, 2017 at 5:26 PM, Mickaël Salaün <mic@xxxxxxxxxxx> wrote:
> The seccomp(2) syscall can be used to apply a Landlock rule to the
> current process. As with a seccomp filter, the Landlock rule is enforced
> for all its future children. An inherited rule tree can be updated
> (append-only) by the owner of inherited Landlock nodes (e.g. a parent
> process that creates a new rule).

Can you clarify exactly what this type of update does? Is it
something that should be supported by normal seccomp rules as well?

> +/**
> + * landlock_run_prog - run Landlock program for a syscall

Unless this is actually specific to syscalls, s/for a syscall//, perhaps?

> + if (new_events->nodes[event_idx]->owner ==
> + &new_events->nodes[event_idx]) {
> + /* We are the owner, we can then update the node. */
> + add_landlock_rule(new_events, rule);

This is the part I don't get. Adding a rule if you're the owner (BTW,
why is ownership visible to userspace at all?) for just yourself and
future children is very different from adding it so it applies to
preexisting children too.
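
To make the distinction concrete, here is a toy userspace model of the two
behaviors I have in mind. This is only a sketch, not the patch's code: the
struct layouts and the names append_shared(), append_private() and
count_rules() are all made up for illustration, and nothing is freed or
refcounted.

```c
#include <stdlib.h>

/* Hypothetical toy model; these are NOT the patch's data structures. */
struct rule {
	int id;
	struct rule *prev;	/* older rule in the same node, or NULL */
};

struct node {
	struct rule *last;	/* newest rule attached to this node */
	struct node *prev;	/* inherited (older) node, or NULL */
};

struct task {
	struct node *node;	/* copied by pointer on "fork" */
};

/*
 * Behavior A: append in place to the node the task points at.
 * Every task sharing that node, including preexisting children,
 * sees the new rule.
 */
int append_shared(struct task *t, int id)
{
	struct rule *r = malloc(sizeof(*r));

	if (!r)
		return -1;
	r->id = id;
	r->prev = t->node->last;
	t->node->last = r;
	return 0;
}

/*
 * Behavior B (seccomp-filter-like): chain a private node in front of
 * the inherited one. Only this task and its future children see it.
 */
int append_private(struct task *t, int id)
{
	struct node *n = malloc(sizeof(*n));
	struct rule *r = malloc(sizeof(*r));

	if (!n || !r) {
		free(n);
		free(r);
		return -1;
	}
	r->id = id;
	r->prev = NULL;
	n->last = r;
	n->prev = t->node;
	t->node = n;
	return 0;
}

/* Count every rule that would run for this task. */
int count_rules(const struct task *t)
{
	int n = 0;

	for (const struct node *nd = t->node; nd; nd = nd->prev)
		for (const struct rule *r = nd->last; r; r = r->prev)
			n++;
	return n;
}
```

Start with a parent and a child that share one empty node. After
append_shared() on the parent, both tasks run one rule. After a further
append_private() on the parent, the parent runs two rules and the child
still runs one. Which of these does the owner path give you?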


> + } else if (atomic_read(&current_events->usage) == 1) {
> + WARN_ON(new_events->nodes[event_idx]->owner);
> + /*
> + * We can become the new owner if no other task uses it.
> + * This avoids an unnecessary allocation.
> + */
> + new_events->nodes[event_idx]->owner =
> + &new_events->nodes[event_idx];
> + add_landlock_rule(new_events, rule);
> + } else {
> + /*
> + * We are not the owner, we need to fork current_events
> + * and then add a new node.
> + */
> + struct landlock_node *node;
> + size_t i;
> +
> + node = kmalloc(sizeof(*node), GFP_KERNEL);
> + if (!node) {
> + new_events = ERR_PTR(-ENOMEM);
> + goto put_rule;
> + }
> + atomic_set(&node->usage, 1);
> + /* set the previous node after the new_events
> + * allocation */
> + node->prev = NULL;
> + /* do not increment the previous node usage */
> + node->owner = &new_events->nodes[event_idx];
> + /* rule->prev is already NULL */
> + atomic_set(&rule->usage, 1);
> + node->rule = rule;
> +
> + new_events = new_raw_landlock_events();
> + if (IS_ERR(new_events)) {
> + /* put the rule as well */
> + put_landlock_node(node);
> + return ERR_PTR(-ENOMEM);
> + }
> + for (i = 0; i < ARRAY_SIZE(new_events->nodes); i++) {
> + new_events->nodes[i] =
> + lockless_dereference(
> + current_events->nodes[i]);
> + if (i == event_idx)
> + node->prev = new_events->nodes[i];
> + if (!WARN_ON(!new_events->nodes[i]))
> + atomic_inc(&new_events->nodes[i]->usage);
> + }
> + new_events->nodes[event_idx] = node;
> +
> + /*
> + * @current_events will not be freed here because its usage
> + * field is > 1. It is only prevented from being freed by another
> + * subject thanks to the caller of landlock_append_prog() which
> + * should be locked if needed.
> + */
> + put_landlock_events(current_events);
> + }
> + }
> + return new_events;
> +
> +put_prog:
> + bpf_prog_put(prog);
> + return new_events;
> +
> +put_rule:
> + put_landlock_rule(rule);
> + return new_events;
> +}
> +
> +/**
> + * landlock_seccomp_append_prog - attach a Landlock rule to the current process
> + *
> + * current->seccomp.landlock_events is lazily allocated. When a process forks,
> + * only a pointer is copied. When a new event is added by a process, if there
> + * are other references to this process' landlock_events, then a new allocation
> + * is made to contain an array pointing to Landlock rule lists. This design
> + * has low-performance impact and is memory efficient while keeping the
> + * property of append-only rules.
> + *
> + * @flags: not used for now, but could be used for TSYNC
> + * @user_bpf_fd: file descriptor pointing to a loaded Landlock rule
> + */
> +#ifdef CONFIG_SECCOMP_FILTER
> +int landlock_seccomp_append_prog(unsigned int flags, const char __user *user_bpf_fd)
> +{
> + struct landlock_events *new_events;
> + struct bpf_prog *prog;
> + int bpf_fd;
> +
> + /* force no_new_privs to limit privilege escalation */
> + if (!task_no_new_privs(current))
> + return -EPERM;
> + /* will be removed in the future to allow unprivileged tasks */
> + if (!capable(CAP_SYS_ADMIN))
> + return -EPERM;
> + if (!user_bpf_fd)
> + return -EFAULT;
> + if (flags)
> + return -EINVAL;
> + if (copy_from_user(&bpf_fd, user_bpf_fd, sizeof(bpf_fd)))
> + return -EFAULT;
> + prog = bpf_prog_get(bpf_fd);
> + if (IS_ERR(prog))
> + return PTR_ERR(prog);
> +
> + /*
> + * We don't need to lock anything for the current process hierarchy,
> + * everything is guarded by the atomic counters.
> + */
> + new_events = landlock_append_prog(current->seccomp.landlock_events, prog);

Do you need to check that it's the right *kind* of bpf prog or is that
handled elsewhere?
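
Roughly the check I'd expect somewhere on this path, written as a
standalone userspace sketch rather than kernel code. The enum, struct and
function names below are hypothetical stand-ins, and I'm assuming the
series defines a dedicated Landlock program type. In the kernel itself, I
believe bpf_prog_get_type() already does the fd lookup and the type
comparison in one step, which may be the simpler option if it isn't
handled elsewhere.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's program types and struct. */
enum prog_type {
	PROG_TYPE_SOCKET_FILTER,
	PROG_TYPE_LANDLOCK,	/* assumed: a dedicated Landlock type */
};

struct prog {
	enum prog_type type;
	int refcnt;
};

/* Drop one reference (models bpf_prog_put() in spirit only). */
void prog_put(struct prog *p)
{
	p->refcnt--;
}

/*
 * Accept the program only if it has the expected type. On a mismatch
 * the reference taken by the lookup is dropped and -EINVAL is returned,
 * so the caller never attaches a program of the wrong kind.
 */
int attach_checked(struct prog *p, enum prog_type expected)
{
	if (!p)
		return -EINVAL;
	if (p->type != expected) {
		prog_put(p);
		return -EINVAL;
	}
	return 0;
}
```

The point is just that the type test has to happen before the program is
linked into the rule list, and that the failure path must drop the
reference it took.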

--Andy