On Tue, Mar 13, 2018 at 2:44 PM, Nagarathnam Muthusamy
<nagarathnam.muthusamy@xxxxxxxxxx> wrote:
If there are two containers that use the same UID range,
On 03/13/2018 02:28 PM, Jann Horn wrote:
On Tue, Mar 13, 2018 at 2:20 PM, Nagarathnam Muthusamy
<nagarathnam.muthusamy@xxxxxxxxxx> wrote:
On 03/13/2018 01:47 PM, Jann Horn wrote:
How do you do that in a race-free manner?
On Mon, Mar 12, 2018 at 10:18 AM, <nagarathnam.muthusamy@xxxxxxxxxx>
wrote:
Resending the RFC with participants of previous discussions
in the list.
How are you dealing with PID reuse?
The following patch, which is a variation of a solution discussed
in https://lwn.net/Articles/736330/, provides users of PID
namespaces with the ability to translate PIDs between
namespaces using a namespace identifier. The topic of
PID translation has been discussed in the community a few times,
but there has always been resistance to adding a new solution
for this problem.
I will outline the planned use case of PID namespaces by Oracle
Database and explain why none of the existing solutions can
be used to solve their problem.
Consider a system in which several PID namespaces with multiple
nested levels exist in parallel, with monitor processes managing
all the namespaces. PID translation is required for controlling
and accessing information about the processes by the monitors
and other processes down the hierarchy of namespaces. Controlling
primarily involves a process in a parent namespace sending signals
to, or using ptrace on, any of the processes in its child namespace.
Accessing information means reading the /proc/<pid>/* files
of processes in a child namespace. None of the processes have
root/CAP_SYS_ADMIN privileges.
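For context, one mechanism that already exists for looking down the
hierarchy is the NSpid: line of /proc/<pid>/status (Linux 4.1 and later).
Per proc(5), it lists the task's ID in the PID namespace of the procfs
mount first, followed by its ID in each successively nested namespace.
The following is only a minimal userspace sketch of parsing such a line;
the helper name parse_nspid_line is hypothetical and none of this is code
from the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/*
 * Parse an "NSpid:" line as found in /proc/<pid>/status.
 * Stores up to max PIDs in out[], outermost namespace first.
 * Returns the number of PIDs stored, or -1 if this is not an NSpid line.
 * (Hypothetical helper, for illustration only.)
 */
static int parse_nspid_line(const char *line, pid_t *out, int max)
{
	static const char key[] = "NSpid:";
	const char *p;
	int n = 0;

	assert(line != NULL && out != NULL);
	if (strncmp(line, key, sizeof(key) - 1) != 0)
		return -1;

	p = line + sizeof(key) - 1;
	while (n < max) {
		char *end;
		long v = strtol(p, &end, 10);	/* skips leading tabs/spaces */

		if (end == p)	/* no further number on this line */
			break;
		out[n++] = (pid_t)v;
		p = end;
	}
	return n;
}
```

A caller would read /proc/<pid>/status line by line and hand each line to
this helper until it returns a non-negative count. This only covers
processes that are visible through the caller's procfs mount.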
We have a monitor process which keeps track of the liveness of
important processes. When a process dies, the monitor makes a note of
it and hence detects if the PID is reused.
AFAIK, the monitor runs periodically to check the liveness of the processes,
and this period is too short for PIDs to be recycled. I will get back with
more information on this if any other mechanisms are in place.
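One common way for a userspace monitor to notice PID reuse is to remember
a process's start time (field 22, starttime, of /proc/<pid>/stat) and
compare it on every check. The sketch below is an assumption about how such
a check could look, not a description of the monitor discussed here;
parse_starttime is a hypothetical helper. Checking and then signalling are
still two separate steps, so this narrows the reuse window rather than
closing it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Extract field 22 (starttime) from the contents of /proc/<pid>/stat.
 * The comm field (field 2) may itself contain spaces and parentheses,
 * so parsing starts after the last ')'.
 * Returns 0 on success, -1 if the line is malformed or too short.
 * (Hypothetical helper, for illustration only.)
 */
static int parse_starttime(const char *stat_line, unsigned long long *out)
{
	const char *p;
	char *end;
	int field;

	assert(stat_line != NULL && out != NULL);
	p = strrchr(stat_line, ')');
	if (!p)
		return -1;
	p++;	/* now at the space that precedes field 3 (state) */

	for (field = 3; field <= 22; field++) {
		while (*p == ' ')
			p++;
		if (*p == '\0')
			return -1;	/* ran out of fields */
		if (field == 22)
			break;		/* p points at starttime */
		while (*p != ' ' && *p != '\0')
			p++;		/* skip over this field */
	}

	*out = strtoull(p, &end, 10);
	return end == p ? -1 : 0;
}
```

A monitor would store the value when it first sees the process and treat a
different value for the same PID as a different process.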
I thought it should have access to those procfs files to satisfy the
But the translator doesn't actually need to have access to those
+ */
AFAICS this proposal breaks the visibility restrictions that
+SYSCALL_DEFINE3(translate_pid, pid_t, pid, u64, source,
+		u64, target)
+{
+	struct pid_namespace *source_ns = NULL, *target_ns = NULL;
+	struct pid *struct_pid;
+	struct pid_namespace *ph;
+	struct hlist_bl_head *shead = NULL;
+	struct hlist_bl_head *thead = NULL;
+	struct hlist_bl_node *dup_node;
+	pid_t result;
+
+	if (!source) {
+		source_ns = &init_pid_ns;
+	} else {
+		shead = pid_ns_hash_head(pid_ns_hash, source);
+		hlist_bl_lock(shead);
+		hlist_bl_for_each_entry(ph, dup_node, shead, node) {
+			if (source == ph->ns.ns_id) {
+				source_ns = ph;
+				break;
+			}
+		}
+		if (!source_ns) {
+			hlist_bl_unlock(shead);
+			return -EINVAL;
+		}
+	}
+	if (!ptrace_may_access(source_ns->child_reaper,
+				PTRACE_MODE_READ_FSCREDS)) {
namespaces normally create. If there are two namespace-based
containers that use the same UID range, I don't think they should be
able to learn information about each other, such as which PIDs are in
use in the other container; but as far as I can tell, your proposal
makes it possible to do that (unless an LSM or so is interfering). I
would prefer it if this API required visibility of the targeted PID
namespaces in the caller's PID namespace.
I am trying to mirror the access restrictions that apply to
a process's /proc/<pid>/ns/pid file. If the translator has
access to the /proc/<pid>/ns/pid files of both the source and
destination namespaces, shouldn't it be allowed to translate the
PID between them?
procfs files, right?
visibility constraint that targeted PID namespaces should be visible
in the caller's PID namespace, and ptrace_may_access() checks that
constraint.
ptrace_may_access() checks from a process in one container on a
process in another container can pass. Normally, you just can't even
reach the ptrace_may_access() checks because you can't reference
processes in another container in any way.
Will look into this. Can you point me to the specifics of the
By the way, a related concern: The use of global identifiers will
probably also negatively affect Checkpoint/Restore In Userspace?