RE: [PATCH v2] ns: introduce getnspid syscall

From: chenhanxiao@xxxxxxxxxxxxxx
Date: Thu Jun 26 2014 - 06:20:12 EST




> -----Original Message-----
> From: Serge Hallyn [mailto:serge.hallyn@xxxxxxxxxx]
> Sent: Wednesday, June 25, 2014 10:39 PM
> To: Chen, Hanxiao
> Cc: Serge E. Hallyn; Eric W. Biederman; Richard Weinberger;
> containers@xxxxxxxxxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Oleg
> Nesterov; David Howells; Al Viro; linux-api@xxxxxxxxxxxxxxx
> > > > > >
> > > > >
> > > > > I don't think that adding a new system call for this is a good solution.
> > > > > > We need a more generic way. I bet people are interested in more than just
> > > > > > PID numbers.
> > > >
> > > > Could you please give some hints on how to expand this interface?
> > > >
> > > > >
> > > > > I agree with Eric that a procfs solution is more appropriate.
> > > > >
> > > >
> > > > Procfs is a good solution, but a syscall is not bad either.
> > >
> > > I might be inclined to agree, except that in this case you are still
> > > needing mounted procfs anyway to get the proc/$pid/ns/pid fds.
> > >
> > > I'm sorry, I've not been watching this thread, so this probably has been
> > > considered and decided against, but I'm going to ask anyway. Keeping
> > > in mind both checkpoint-restart and introspection for use in a
> > > setns'd command, why not make it
> > >
> > > pid_t getnspid(pid_t query_pid, pid_t observer_pid)
> > >
> > > which returns the process id of query_pid as seen from observer_pid's
> > > pidns?
> > >
> >
> > But this could be ambiguous with nested namespaces.
> >
> > Ex:
> > (thanks for Pavel's figure)
> >       init_pid_ns   ns1     ns2
> > t1    2
> > t2    `- 3          1
> > t3    `- 4          `- 5    1
> > t4    5
> >
> > a) getnspid(1, 1):
> > We expected it could return t2's pid(2nd 1 as pid
>
> Clearly the passed-in pids should be interpreted as relative
> to current's pidns. There can be no ambiguity at that point,
> unless I'm overlooking something.
>

Defaulting to current's pidns looks reasonable.
But nested namespaces will still cause trouble.
Since the intermediate namespace levels are of less interest to users,
how about ignoring them and showing only the pid in the deepest level?

Ex:
(Thanks to Pavel for the figure again)
      init_pid_ns   ns1     ns2
t1    2
t2    `- 3          1
t3    `- 4          `- 5    1
t4    `- 5          `- 7    `- 2

1. In init_pid_ns:
a) getnspid(2, 1):
returns 2 (t1)

b) getnspid(1, 3):
returns 3 (t2)

c) getnspid(1, 4):
returns 4 (t3)

d) getnspid(2, 4):
returns 5 (t4)

2. In ns1:
a) getnspid(2, 5):
returns 7 (t4)
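
To make the rule concrete, here is a rough userspace model of it (a sketch
only, not kernel code). It hard-codes the table from the figure, looks up the
second argument in the caller's namespace, takes that task's deepest
namespace, and resolves the first argument there. The names model_getnspid,
find_task and deepest_level are made up for this illustration, and the "init"
row (pid 1 in init_pid_ns) is an assumption, since the figure does not list it.

```c
#include <stddef.h>

#define NLEVELS 3	/* 0 = init_pid_ns, 1 = ns1, 2 = ns2 */

struct task {
	const char *name;
	int pid[NLEVELS];	/* pid at each level, 0 = not visible there */
};

static const struct task tasks[] = {
	{ "init", { 1, 0, 0 } },	/* assumed, not part of the figure */
	{ "t1",   { 2, 0, 0 } },
	{ "t2",   { 3, 1, 0 } },
	{ "t3",   { 4, 5, 1 } },
	{ "t4",   { 5, 7, 2 } },
};
#define NTASKS (sizeof(tasks) / sizeof(tasks[0]))

/* Find the task that has the given pid at the given namespace level. */
static const struct task *find_task(int level, int pid)
{
	for (size_t i = 0; i < NTASKS; i++)
		if (pid != 0 && tasks[i].pid[level] == pid)
			return &tasks[i];
	return NULL;
}

/* Deepest namespace level in which the task has a pid. */
static int deepest_level(const struct task *t)
{
	int level = 0;

	for (int i = 0; i < NLEVELS; i++)
		if (t->pid[i] != 0)
			level = i;
	return level;
}

/*
 * caller_level stands in for current's pidns. observer_pid is read
 * relative to the caller; query_pid is read relative to the observer's
 * deepest namespace. Returns the pid as seen by the caller, or -1.
 */
int model_getnspid(int caller_level, int query_pid, int observer_pid)
{
	const struct task *observer, *target;

	if (caller_level < 0 || caller_level >= NLEVELS)
		return -1;
	observer = find_task(caller_level, observer_pid);
	if (!observer)
		return -1;
	target = find_task(deepest_level(observer), query_pid);
	if (!target || target->pid[caller_level] == 0)
		return -1;
	return target->pid[caller_level];
}
```

With caller_level 0 for init_pid_ns and 1 for ns1, this model gives the same
return values as the cases listed above, for example
model_getnspid(0, 1, 3) == 3 and model_getnspid(1, 2, 5) == 7.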

How do you like this idea?

Thanks,
- Chen

> > such as systemd in init_pid_ns),
> > but t3's pid is also an appropriate result.
> > We may get more than one result.
> >
> > b) getnspid(5, 1):
> > (1st 5 was expected as pid in ns1)
> > t3's pid and t4's pid could both be the answer.
> > We cannot determine which one is wanted.
> >
> > So something unique, like the fds of the ns, would be
> > a better reference.
> >
> > Thanks,
> > - Chen
> >
> > >
> > > > Procfs works for me, but it seems it would not fit
> > > > Pavel's requirement.
> > > > His opinion is that a syscall is a more generic interface
> > > > than proc files, and also very helpful.
> > > > A syscall could also tell whether a pid lives in a specific pid namespace,
> > > > which is much more convenient than procfs.
> > > >
> > > > Thanks,
> > > > - Chen
> > >
> > > > _______________________________________________
> > > > Containers mailing list
> > > > Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
> > > > https://lists.linuxfoundation.org/mailman/listinfo/containers