Re: [PATCH 2/2] pid_ns: Introduce ioctl to set vector of ns_last_pid's on ns hierarhy
From: Kirill Tkhai
Date: Wed Apr 26 2017 - 12:44:13 EST
On 26.04.2017 19:32, Eric W. Biederman wrote:
> Kirill Tkhai <ktkhai@xxxxxxxxxxxxx> writes:
>
>> On 26.04.2017 19:11, Kirill Tkhai wrote:
>>> On 26.04.2017 18:53, Oleg Nesterov wrote:
>>>> On 04/17, Kirill Tkhai wrote:
>>>>>
>>>>> +struct pidns_ioc_req {
>>>>> +/* Set vector of last pids in namespace hierarchy */
>>>>> +#define PIDNS_REQ_SET_LAST_PID_VEC 0x1
>>>>> + unsigned int req;
>>>>> + void __user *data;
>>>>> + unsigned int data_size;
>>>>> + char std_fields[0];
>>>>> +};
>>>>
>>>> see below,
>>>>
>>>>> +static long set_last_pid_vec(struct pid_namespace *pid_ns,
>>>>> + struct pidns_ioc_req *req)
>>>>> +{
>>>>> + char *str, *p;
>>>>> + int ret = 0;
>>>>> + pid_t pid;
>>>>> +
>>>>> + read_lock(&tasklist_lock);
>>>>> + if (!pid_ns->child_reaper)
>>>>> + ret = -EINVAL;
>>>>> + read_unlock(&tasklist_lock);
>>>>> + if (ret)
>>>>> + return ret;
>>>>
>>>> why do you need to check ->child_reaper under tasklist_lock? this looks pointless.
>>>>
>>>> In fact I do not understand how it is possible to hit pid_ns->child_reaper == NULL,
>>>> there must be at least one task in this namespace, otherwise you can't open a file
>>>> which has f_op == ns_file_operations, no?
>>>
>>> Sure, it's impossible to pick a pid_ns, if there is no the pid_ns's tasks. I added
>>> it under impression of
>>> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=dfda351c729733a401981e8738ce497eaffcaa00
>>> but here it's completely wrong. It will be removed in v2.
>>>
>>>>> + if (req->data_size >= PAGE_SIZE)
>>>>> + return -EINVAL;
>>>>> + str = vmalloc(req->data_size + 1);
>>>>
>>>> then I don't understand why it makes sense to use vmalloc()
>>>>
>>>>> + if (!str)
>>>>> + return -ENOMEM;
>>>>> + if (copy_from_user(str, req->data, req->data_size)) {
>>>>> + ret = -EFAULT;
>>>>> + goto out_vfree;
>>>>> + }
>>>>> + str[req->data_size] = '\0';
>>>>> +
>>>>> + p = str;
>>>>> + while (p && *p != '\0') {
>>>>> + if (!ns_capable(pid_ns->user_ns, CAP_SYS_ADMIN)) {
>>>>> + ret = -EPERM;
>>>>> + goto out_vfree;
>>>>> + }
>>>>> +
>>>>> + if (sscanf(p, "%d", &pid) != 1 || pid < 0 || pid > pid_max) {
>>>>> + ret = -EINVAL;
>>>>> + goto out_vfree;
>>>>> + }
>>>>
>>>> Well, this is ioctl(), do we really want to parse the strings?
>>>>
>>>> Can't we make
>>>>
>>>> struct pidns_ioc_req {
>>>> ...
>>>> int nr_pids;
>>>> pid_t pids[0];
>>>> }
>>>>
>>>> and just use get_user() in a loop? This way we can avoid vmalloc() or anything
>>>> else altogether.
>>>
>>> Since it's a generic structure for different types of the requests, it may be extended
>>> in the future. We won't be able to add new fields, if we compose the structure the way
>>> you suggested, will we?
>>
>> Though, we may go this way if just do the fields generic:
>>
>> struct pidns_ioc_req {
>> unsigned int req;
>> unsigned int data_size;
>> union {
>> pid_t pid[0];
>> };
>> };
>>
>> Ok, I'll rework the patchset in this way.
>
> You don't need that. That is what new ioctl numbers are for.
>
> Interfaces to the kernel don't need to become multiplexors to prepare
> for the future when there is already an appropriate multiplexing
> interface in place.
That is, do you suggest to not introduce NS_SPECIFIC_IO from the first patch,
and add PIDNS_REQ_SET_LAST_PID_VEC to the list of generic ns ioctls?
...
#define NS_GET_OWNER_UID _IO(NSIO, 0x4)
#define PIDNS_REQ_SET_LAST_PID_VEC _IO(NSIO, 0x5)