Re: [PATCHv3 2/2] /proc/PID/status: show all sets of pid according to ns

From: Serge E. Hallyn
Date: Mon Sep 29 2014 - 10:00:18 EST


Quoting Chen, Hanxiao (chenhanxiao@xxxxxxxxxxxxxx):
> Hi,
>
> > -----Original Message-----
> > From: containers-bounces@xxxxxxxxxxxxxxxxxxxxxxxxxx
> > [mailto:containers-bounces@xxxxxxxxxxxxxxxxxxxxxxxxxx] On Behalf Of Chen
> > Hanxiao
> > Sent: Wednesday, September 24, 2014 6:00 PM
> > To: containers@xxxxxxxxxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> > Cc: Richard Weinberger; Serge Hallyn; Oleg Nesterov; Mateusz Guzik; David Howells;
> > Eric W. Biederman
> > Subject: [PATCHv3 2/2] /proc/PID/status: show all sets of pid according to ns
> >
> > If an issue occurs inside a container guest, the host user
> > cannot tell which process is in trouble from the guest pid alone:
> > users of the container guest only know the pid inside the container.
> > This is an obstacle to troubleshooting.
> >
> > This patch adds four fields: NStgid, NSpid, NSpgid and NSsid:
> > a) In init_pid_ns, nothing changed;
> >
> > b) In one pidns, the fields tell the pid inside containers:
> > NStgid: 21776 5 1
> > NSpid: 21776 5 1
> > NSpgid: 21776 5 1
> > NSsid: 21729 1 0
> > ** Process id is 21776 in level 0, 5 in level 1, 1 in level 2.
> >
> > c) If pidns is nested, it depends on which pidns you are in.
> > NStgid: 5 1
> > NSpid: 5 1
> > NSpgid: 5 1
> > NSsid: 1 0
> > ** Views from level 1
> >
>
> This patch is simple, useful and safe.
> But so far there has not been any feedback.
>
> Any comments or ideas?

Thanks, Chen. The code looks fine. My concern is that you are
exposing information which cannot be checkpointed and restarted.
In particular, if I'm inside a nested container, say at pidns
level 3, then my own NSpid info, when I read it, will show my
pids in the parent namespaces as well. If I'm restarted at the
third pidns level, only the one pid can be restored.
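
To make the concern concrete, here is a minimal Python sketch (not
part of the patch) that parses the four proposed fields and splits
each one into the values seen from ancestor namespaces and the
single value in the task's own namespace. The sample text is the
example from the patch description; the names parse_ns_fields and
split_own are hypothetical, and treating only the last value as
restorable is an assumption drawn from the paragraph above.

```python
NS_FIELDS = ("NStgid", "NSpid", "NSpgid", "NSsid")

# Example (b) from the patch description, used as sample input.
SAMPLE_STATUS = """\
NStgid: 21776 5 1
NSpid: 21776 5 1
NSpgid: 21776 5 1
NSsid: 21729 1 0
"""


def parse_ns_fields(status_text):
    """Map each NS* field to its ids, outermost visible level first."""
    fields = {}
    for line in status_text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name.strip() in NS_FIELDS:
            fields[name.strip()] = [int(tok) for tok in rest.split()]
    return fields


def split_own(ids):
    """Return (ids in ancestor namespaces, id in the task's own namespace)."""
    return ids[:-1], ids[-1]


if __name__ == "__main__":
    for name, ids in parse_ns_fields(SAMPLE_STATUS).items():
        ancestors, own = split_own(ids)
        # Assumption: only `own` can be reproduced after a restart at
        # the same nesting level; the ancestor values may differ.
        print(f"{name}: own={own} ancestors={ancestors}")
```

A checkpointed application that had read and remembered any of the
ancestor values would see them change after a restart.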

Now it may be fair to say "this is proc, and proc and sys show
host info which is not containerized and cannot be checkpointed
and restarted, deal with it." But I'm not sure.

There are two ways you could deal with this. One would be to
show the nspids only to the level of the reader of the file - but
I don't think you need to do that. I think you're better off
simply showing the pids up to the level of the struct pid for
the mounter of the procfs. So if I'm inside container c2 which
is inside container c1, my own /proc will only show pids which
are valid in c2 (and any child namespaces), while the /proc
mounted in c1 will show pids valid in c1 and c2 (and any children),
but not those in the init_pid_ns. It's then just up to the
container administrators to make sure that c2 cannot see c1's
/proc to confuse itself and confuddle checkpoint-restart.
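
The two options above differ only in whose pidns level is used to
cut the list. A rough Python model, not kernel code: ids_by_level
and visible_ids are hypothetical names, and comparing levels alone
is a simplification, since a real implementation would also have to
check that the viewer's namespace is actually an ancestor of the
task's.

```python
def visible_ids(ids_by_level, viewer_level):
    """Return the ids shown to a viewer at the given pidns level.

    ids_by_level[i] is the task's id in its namespace at level i,
    with level 0 being init_pid_ns. viewer_level is the level of the
    procfs mounter (the suggested option) or of the reader of the
    file (the alternative).
    """
    if viewer_level < 0:
        raise ValueError("pidns level cannot be negative")
    # An empty result means the task is not visible from that level.
    return ids_by_level[viewer_level:]


if __name__ == "__main__":
    # Example task from the patch description: 21776 in level 0,
    # 5 in level 1, 1 in level 2.
    task = [21776, 5, 1]
    for level in range(3):
        print(level, " ".join(str(i) for i in visible_ids(task, level)))
```

With the mounter's level as viewer_level, a /proc mounted in c1
(level 1) would show "5 1" for this task, and one mounted in c2
(level 2) would show only "1".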

-serge