Re: [PATCH 2/2] pid_ns: Introduce ioctl to set vector of ns_last_pid's on ns hierarhy
From: Kirill Tkhai
Date: Wed Apr 26 2017 - 12:12:07 EST
On 26.04.2017 18:53, Oleg Nesterov wrote:
> On 04/17, Kirill Tkhai wrote:
>>
>> +struct pidns_ioc_req {
>> +/* Set vector of last pids in namespace hierarchy */
>> +#define PIDNS_REQ_SET_LAST_PID_VEC 0x1
>> + unsigned int req;
>> + void __user *data;
>> + unsigned int data_size;
>> + char std_fields[0];
>> +};
>
> see below,
>
>> +static long set_last_pid_vec(struct pid_namespace *pid_ns,
>> + struct pidns_ioc_req *req)
>> +{
>> + char *str, *p;
>> + int ret = 0;
>> + pid_t pid;
>> +
>> + read_lock(&tasklist_lock);
>> + if (!pid_ns->child_reaper)
>> + ret = -EINVAL;
>> + read_unlock(&tasklist_lock);
>> + if (ret)
>> + return ret;
>
> why do you need to check ->child_reaper under tasklist_lock? this looks pointless.
>
> In fact I do not understand how it is possible to hit pid_ns->child_reaper == NULL,
> there must be at least one task in this namespace, otherwise you can't open a file
> which has f_op == ns_file_operations, no?
Sure, it's impossible to pick a pid_ns if it has no tasks. I added the check
under the impression of
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=dfda351c729733a401981e8738ce497eaffcaa00
but here it's completely wrong. It will be removed in v2.
>> + if (req->data_size >= PAGE_SIZE)
>> + return -EINVAL;
>> + str = vmalloc(req->data_size + 1);
>
> then I don't understand why it makes sense to use vmalloc()
>
>> + if (!str)
>> + return -ENOMEM;
>> + if (copy_from_user(str, req->data, req->data_size)) {
>> + ret = -EFAULT;
>> + goto out_vfree;
>> + }
>> + str[req->data_size] = '\0';
>> +
>> + p = str;
>> + while (p && *p != '\0') {
>> + if (!ns_capable(pid_ns->user_ns, CAP_SYS_ADMIN)) {
>> + ret = -EPERM;
>> + goto out_vfree;
>> + }
>> +
>> + if (sscanf(p, "%d", &pid) != 1 || pid < 0 || pid > pid_max) {
>> + ret = -EINVAL;
>> + goto out_vfree;
>> + }
>
> Well, this is ioctl(), do we really want to parse the strings?
>
> Can't we make
>
> struct pidns_ioc_req {
> ...
> int nr_pids;
> pid_t pids[0];
> }
>
> and just use get_user() in a loop? This way we can avoid vmalloc() or anything
> else altogether.
Since it's a generic structure for different types of requests, it may be extended
in the future. If we compose the structure the way you suggested, with the pids[]
array at the end, we won't be able to add new fields after it, will we?
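
To make the extensibility concern concrete, here is a minimal userspace sketch. It is not code from the patch: struct req_v1, struct req_v2, the flags field and the helper v2_flags_seen_for_v1_request() are all hypothetical names, and the sketch assumes pid_t and unsigned int have the same size, as they do on Linux. It shows that a trailing array forces any new fixed field to go before the array, which shifts the array's offset, so a request built against the old layout is misread by code using the new one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical v1: a request code plus the trailing array from the review. */
struct req_v1 {
	unsigned int req;
	int nr_pids;
	pid_t pids[];		/* a flexible array member must be last */
};

/* Hypothetical v2: a later revision that needs one more fixed field. */
struct req_v2 {
	unsigned int req;
	int nr_pids;
	unsigned int flags;	/* new field, can only go before pids[] */
	pid_t pids[];
};

/* Assumptions of this sketch, checked at compile time. */
static_assert(sizeof(pid_t) == sizeof(unsigned int),
	      "sketch assumes pid_t is as wide as unsigned int");
static_assert(offsetof(struct req_v1, pids) == offsetof(struct req_v2, flags),
	      "v2 'flags' overlays the first v1 pid");
static_assert(offsetof(struct req_v1, pids) != offsetof(struct req_v2, pids),
	      "the array moved between the two layouts");

/*
 * Lay out a one-pid request the way a binary built against v1 would,
 * then return what a v2 reader finds in its 'flags' field.
 */
static unsigned int v2_flags_seen_for_v1_request(pid_t first_pid)
{
	unsigned char buf[sizeof(struct req_v2) + sizeof(pid_t)] = { 0 };
	struct req_v1 hdr = { .req = 1, .nr_pids = 1 };
	unsigned int flags;

	memcpy(buf, &hdr, sizeof(hdr));
	memcpy(buf + offsetof(struct req_v1, pids), &first_pid,
	       sizeof(first_pid));
	memcpy(&flags, buf + offsetof(struct req_v2, flags), sizeof(flags));
	return flags;
}
```

Calling v2_flags_seen_for_v1_request(1234) returns the pid itself, i.e. the old caller's first pid is interpreted as flags. Whether that outweighs the cost of parsing strings in an ioctl() is the open question in this thread; the sketch only illustrates the layout problem.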