Re: [PATCH 1/1] /proc/$PID/status : show list NSpid data based on current process namespace.

From: Eric W. Biederman
Date: Mon Jun 01 2015 - 07:27:49 EST




On June 1, 2015 3:16:57 AM CDT, Kuenhwan Kwak <kh243.kwak@xxxxxxxxxxx> wrote:
>
>On 05/30/2015 07:21 AM, Andrew Morton wrote:
>> On Fri, 29 May 2015 11:57:21 +0900 Kuenhwan Kwak <kh243.kwak@xxxxxxxxxxx> wrote:
>>
>>> This patch helps create pid mapping data for parent processes.
>>>
>>> Reading the 'NSpid' field in '/proc/$PID/status' is currently a simple
>>> way to get a child pid from a parent pid in userspace. But this field
>>> supplies only a single-direction mapping ('parent pid' to 'child pid').
>>> If a parent process wants to translate a child pid to its pid in the
>>> current namespace, there is no way to do so except a full search of
>>> the current procfs.
>>>
>>> This patch helps in getting the current namespace pid by reading the
>>> child procfs file, without any side effects.
>>>
>>> For example, a process has pid 24771 in level 0 and pid 435 in level 1.
>>> a) The output of '/proc/24771/status' in level 0 namespace.
>>> NSpid : 24771 435
>>>
>>> b) The output of '/proc/435/status' in level 1 namespace.
>>> NSpid : 435
>>>
>>> c) A process in level 0 mounts the level 1 proc at '/var/child/proc'
>>> after setns(). The output of '/var/child/proc/435/status' is
>>> NSpid : 24771 435
>>>
>>> ...
>>>
>>> --- a/fs/proc/array.c
>>> +++ b/fs/proc/array.c
>>> @@ -83,6 +83,7 @@
>>> #include <linux/tracehook.h>
>>> #include <linux/string_helpers.h>
>>> #include <linux/user_namespace.h>
>>> +#include <linux/sched.h>
>>>
>>> #include <asm/pgtable.h>
>>> #include <asm/processor.h>
>>> @@ -149,6 +150,9 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
>>> const struct cred *cred;
>>> pid_t ppid, tpid = 0, tgid, ngid;
>>> unsigned int max_fds = 0;
>>> +#ifdef CONFIG_PID_NS
>>> + struct pid_namespace *current_pid_ns = task_active_pid_ns(current);
>>> +#endif
>>>
>>> rcu_read_lock();
>>> ppid = pid_alive(p) ?
>>> @@ -198,19 +202,19 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
>>>
>>> #ifdef CONFIG_PID_NS
>>> seq_puts(m, "\nNStgid:");
>>> - for (g = ns->level; g <= pid->level; g++)
>>> + for (g = current_pid_ns->level; g <= pid->level; g++)
>>> seq_printf(m, "\t%d",
>>> task_tgid_nr_ns(p, pid->numbers[g].ns));
>>> seq_puts(m, "\nNSpid:");
>>> - for (g = ns->level; g <= pid->level; g++)
>>> + for (g = current_pid_ns->level; g <= pid->level; g++)
>>> seq_printf(m, "\t%d",
>>> task_pid_nr_ns(p, pid->numbers[g].ns));
>>> seq_puts(m, "\nNSpgid:");
>>> - for (g = ns->level; g <= pid->level; g++)
>>> + for (g = current_pid_ns->level; g <= pid->level; g++)
>>> seq_printf(m, "\t%d",
>>> task_pgrp_nr_ns(p, pid->numbers[g].ns));
>>> seq_puts(m, "\nNSsid:");
>>> - for (g = ns->level; g <= pid->level; g++)
>>> + for (g = current_pid_ns->level; g <= pid->level; g++)
>>> seq_printf(m, "\t%d",
>>> task_session_nr_ns(p, pid->numbers[g].ns));
>> These changes alter current behaviour, don't they? How do we know
>> this won't impact existing userspace code?
>
>According to the proc_mount() function, the 'ns' value also comes from
>the current process, and procfs is usually accessed by processes in the
>same pid namespace. So using the current value has no impact.
>
>This change only takes effect when a process in the parent ns accesses
>the child procfs. In that case, more NSpid info is printed for the
>parent ns process.

Using current to calculate the value of a file is almost
always a bug (quite frequently security related). The
value of a file should be fixed at the time it is
opened, and usually every opener should see the
same behavior. If the value of a file is not fixed at
open time, file descriptor passing and caching do not
work. Certainly using current is not something that
should be done casually.

The proposed change loses the ability to know what your
set of pids is from the pid namespace of the mounted
instance of proc.

I do not see how this patch buys anything. If you want
to know the full set of pids from the perspective of
your pid namespace, all that needs to be done is to
look in the copy of proc that matches your own pid
namespace. In fact it is necessary if you do not want
a full scan of proc, because otherwise you cannot
know which directory to look in to find your own
children.
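
To make the lookup concrete, here is a minimal userspace sketch,
assuming the "NSpid:" line format printed by the seq_puts() and
seq_printf() calls in the quoted hunk. The helper names
parse_nspid_line() and read_nspid() are hypothetical, and proc_root is
an assumption standing for wherever the proc instance of interest is
mounted (for example "/proc" for the reader's own pid namespace).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper: parse one "NSpid:" line taken from a
 * <proc_root>/<pid>/status file. Stores at most max pids in out[],
 * in the order they appear on the line, and returns how many were
 * stored. Returns -1 if the line is not an NSpid line.
 */
int parse_nspid_line(const char *line, pid_t *out, int max)
{
	static const char key[] = "NSpid:";
	const char *s;
	char *end;
	int n = 0;

	assert(line != NULL && out != NULL && max > 0);

	if (strncmp(line, key, sizeof(key) - 1) != 0)
		return -1;

	s = line + sizeof(key) - 1;
	while (n < max) {
		long v = strtol(s, &end, 10);

		if (end == s)	/* no further number on the line */
			break;
		out[n++] = (pid_t)v;
		s = end;
	}
	return n;
}

/*
 * Hypothetical helper: read the NSpid values of one task from the
 * proc instance mounted at proc_root. Which pids are listed depends
 * on the pid namespace of that proc mount, not on the caller.
 * Returns the number of pids stored, or -1 on error or if no NSpid
 * line is present.
 */
int read_nspid(const char *proc_root, pid_t pid, pid_t *out, int max)
{
	char path[256];
	char line[512];
	FILE *f;
	int n = -1;

	snprintf(path, sizeof(path), "%s/%d/status", proc_root, (int)pid);
	f = fopen(path, "r");
	if (!f)
		return -1;

	while (fgets(line, sizeof(line), f)) {
		n = parse_nspid_line(line, out, max);
		if (n >= 0)
			break;
	}
	fclose(f);
	return n;
}
```

With the pids from the quoted example, a line read from the level 0
proc yields 24771 followed by 435: the first entry is the pid in the
namespace of the proc instance being read, and the last is the pid in
the task's own namespace.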

In short, look in the appropriate copy of proc and you
do not need this ill-considered patch.

Eric
