Re: [PATCH v3 1/2] fork: extend clone3() to support CLONE_SET_TID

From: Oleg Nesterov
Date: Wed Aug 07 2019 - 11:48:33 EST


On 08/06, Adrian Reber wrote:
>
> @@ -2530,12 +2530,14 @@ noinline static int copy_clone_args_from_user(struct kernel_clone_args *kargs,
> struct clone_args __user *uargs,
> size_t size)
> {
> + struct pid_namespace *pid_ns = task_active_pid_ns(current);
> struct clone_args args;
>
> if (unlikely(size > PAGE_SIZE))
> return -E2BIG;
>
> - if (unlikely(size < sizeof(struct clone_args)))
> + /* The struct needs to be at least the size of the original struct. */
> + if (size < (sizeof(struct clone_args) - sizeof(__aligned_u64)))
> return -EINVAL;

slightly off-topic, but with or without this patch I do not understand
-EINVAL. Can't we replace this check with

if (size < sizeof(struct clone_args))
memset((void*)&args + size, sizeof(struct clone_args) - size, 0);

?

this way we can new members at the end of clone_args and this matches
the "if (size > sizeof(struct clone_args))" block below which promises
that whatever we add into clone_args a zero value should work.


And if we do this

> + if (size == sizeof(struct clone_args)) {
> + /* Only check permissions if set_tid is actually set. */
> + if (args.set_tid &&
> + !ns_capable(pid_ns->user_ns, CAP_SYS_ADMIN))
> + return -EPERM;
> + kargs->set_tid = args.set_tid;
> + }

we can move this check into clone3_args_valid() or even copy_process()

if (kargs->set_tid) {
if (!ns_capable(...))
return -EPERM;
}


Either way,

> @@ -2585,6 +2595,10 @@ static bool clone3_args_valid(const struct kernel_clone_args *kargs)
> if (kargs->flags & ~CLONE_LEGACY_FLAGS)
> return false;
>
> + /* Fail if set_tid is invalid */
> + if (kargs->set_tid < 0)
> + return false;

I think it would be more clean to do this along with ns_capable() check,
or along with the "set_tid >= pid_max" check in alloc_pid().

I won't insist, but I do not really like the fact we check set_tid 3 times
in copy_clone_args_from_user(), clone3_args_valid(), and alloc_pid().

Oleg.